- 15 Jun, 2022 1 commit
-
-
Yih-Dar authored
Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 14 Jun, 2022 10 commits
-
-
Younes Belkada authored
-
Suraj Patil authored
-
Michael Benayoun authored
* Function refactor * Update src/transformers/utils/fx.py Co-authored-by:Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Hailey Schoelkopf authored
* add new bloom classes * (feat) add bloom classification tests; make style * style: change import in test * add some typehints to bloom classes * merge main into branch * fix: input checking in bloom seq classification * fix tests * change model class tests * fix few tests - more tests should pass - one test left * make token classifier return hidden states * style: make BLOOM typehints consistent Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
amyeroberts authored
* Swin models call TFSwinMainLayer * Tidy up
-
Sayak Paul authored
* Add note on amy's contribution. Co-authored-by:
Amy Roberts <aeroberts4444@gmail.com> * remove non-tech comment. Co-authored by: Amy Roberts <aeroberts4444@gmail.com> Co-authored-by:
Amy Roberts <aeroberts4444@gmail.com>
-
Shamane Siri authored
* check * update the RAG-end2end with new PL and RAY * removed unwanted comments
-
Patrick von Platen authored
-
jianan-gu authored
* add jit mode option and model wrap * Update src/transformers/training_args.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refine code * Update src/transformers/trainer.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add ut and refine code * code refine * refine code * add inference doc * Update src/transformers/trainer.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * add cpu inference performance doc * Update perf_infer_cpu.mdx * Update perf_infer_cpu.mdx * Update performance.mdx * Update _toctree.yml * refine jit func naming * Update _toctree.yml * Delete perf_infer_gpu_one.mdx * Update perf_infer_cpu.mdx * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * add none check before jit * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/perf_infer_cpu.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Stas Bekman <stas@stason.org> Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
-
Yih-Dar authored
* Fix doc builder Dockerfile Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 13 Jun, 2022 10 commits
-
-
Daniel Stancl authored
* Initial commit * Make some fixes * Make PT model full forward pass * Drop TF & Flax implementation, fix copies etc * Add Flax model and update some corresponding stuff * Drop some TF things * Update config and flax local attn * Add encoder_attention_type to config * . * Update docs * Do some cleansing * Fix some issues -> make style; add some docs * Fix position_bias + mask addition + Update tests * Fix repo consistency * Fix model consistency by removing flax operation over attn_mask * [WIP] Add PT TGlobal LongT5 * . * [WIP] Add flax tglobal model * [WIP] Update flax model to use the right attention type in the encoder * Fix flax tglobal model forward pass * Make the use of global_relative_attention_bias * Add test suites for TGlobal model * Fix minor bugs, clean code * Fix pt-flax equivalence though not convinced with correctness * Fix LocalAttn implementation to match the original impl. + update READMEs * Few updates * Update: [Flax] improve large model init and loading #16148 * Add ckpt conversion script accoring to #16853 + handle torch device placement * Minor updates to conversion script. * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM * gpu support + dtype fix * Apply some suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * * Remove (de)parallelize stuff * Edit shape comments * Update README.md * make fix-copies * Remove caching logic for local & tglobal attention * Apply another batch of suggestions from code review * Add missing checkpoints * Format converting scripts * Drop (de)parallelize links from longT5 mdx * Fix converting script + revert config file change * Revert "Remove caching logic for local & tglobal attention" This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46. * Stash caching logic in Flax model * Make side relative bias used always * Drop caching logic in PT model * Return side bias as it was * Drop all remaining model parallel logic * Remove clamp statements * Move test files to the proper place * Update docs with new version of hf-doc-builder * Fix test imports * Make some minor improvements * Add missing checkpoints to docs * Make TGlobal model compatible with torch.onnx.export * Replace some np.ndarray with jnp.ndarray * Fix TGlobal for ONNX conversion + update docs * fix _make_global_fixed_block_ids and masked neg value * update flax model * style and quality * fix imports * remove load_tf_weights_in_longt5 from init and fix copies * add slow test for TGlobal model * typo fix * Drop obsolete is_parallelizable and one warning * Update __init__ files to fix repo-consistency * fix pipeline test * Fix some device placements * [wip]: Update tests -- need to generate summaries to update expected_summary * Fix quality * Update LongT5 model card * Update (slow) summarization tests * make style * rename checkpoitns * finish * fix flax tests Co-authored-by:
phungvanduy <pvduy23@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
patil-suraj <surajp815@gmail.com>
-
haohanchen-yagao authored
* Add FP16 supporot for sagemaker model parallel * minor fix * fix indentation * handle mix precision exception for smmp * minor fix * remove amp implementation on SMMP * remove redundant stuff * reformat trainer * restyling * reformat
-
Wang, Yi authored
* enable cpu distribution training using mpirun *command like * mpirun -n 2 python3 run_qa.py --no_cuda --xpu_backend ccl xxxx *MASTER_ADDR and MASTER_PORT should be set as env *export MASTER_ADDR=127.0.0.1 *export MASTER_PORT=29500 Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> * fix according to the review comment Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> * use accelerate logic for cpu distribution training to set "RANK","LOCAL_RANK","WORLD_SIZE" environment Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com>
-
Bram Vanroy authored
* allow scope from trainer arg * add ray_scope to training args * escape double quotes * make style && quality * attempt to solve doc style issues * splitting up URLs for style * make fixup * Update src/transformers/training_args.py Co-authored-by:
Antoni Baum <antoni.baum@protonmail.com> * make style Co-authored-by:
Antoni Baum <antoni.baum@protonmail.com>
-
Will Frey authored
I'm guessing that the intention was to have the `_no_split_modules` class attribute for `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how its set as `["GPTJBlock"]` for `GPTJPreTrainedModel`. If this is incorrect, please feel free to just close the PR. Thanks!
-
Sylvain Gugger authored
* Fix dtype getters * Proper fix for dtype getter * Style and commant * Always use last for consistency * Quality
-
Bram Vanroy authored
-
Saint authored
Co-authored-by:Saint <saint@st-mini.local>
-
Sijun He authored
* wip * rebase * all tests pass * rebase * ready for PR * address comments * fix styles * add require_torch to pipeline test * remove remote image to improve CI consistency * address comments; fix tf/flax tests * address comments; fix tf/flax tests * fix tests; add alias * repo consistency tests * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * address comments * Update src/transformers/pipelines/visual_question_answering.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * merge * Update src/transformers/models/auto/modeling_auto.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * merge Co-authored-by:
Sijun He <sijunhe@Sijuns-MacBook-Pro.local> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Ayush Mangal authored
-
- 10 Jun, 2022 17 commits
-
-
Yih-Dar authored
Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Domenic Rosati authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* [BigBirdFlaxTests] Make tests slow * up * correct black with new version
-
Loubna Ben Allal authored
- use CodeParrot scores of v1.1 - change evaluation command to use accelerate
-
Simon Brandeis authored
* Raise RepoNotFoundError in case of 401 * Include changes from revert-17646-skip_repo_not_found * Add a comment *
馃拕 Code quality *馃挌 Update `get_from_cache` test *馃挌 Code quality & skip failing test -
Balaji authored
VisibleDeprecationWarning is addressed by specifying dtype=object when creating numpy array. Update code based on review feedback. Undo whitespace changes to tokenization_utils_base.py. Co-authored-by:I like data <ilikedata@nym.hush.com>
-
Sylvain Gugger authored
-
amyeroberts authored
-
Lysandre authored
-
Lysandre authored
-
dependabot[bot] authored
Bumps [cookiecutter](https://github.com/cookiecutter/cookiecutter) from 1.7.2 to 2.1.1. - [Release notes](https://github.com/cookiecutter/cookiecutter/releases) - [Changelog](https://github.com/cookiecutter/cookiecutter/blob/master/HISTORY.md) - [Commits](https://github.com/cookiecutter/cookiecutter/compare/1.7.2...2.1.1 ) --- updated-dependencies: - dependency-name: cookiecutter dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Alara Dirik authored
* enable crop_center method to handle (W, H, C) images * minor style and comment edits
-
Alara Dirik authored
* move clip image utils to image_utils.py * dont default to square images * fix typo, revert change to test file * edit convert_rgb comments
-
Sylvain Gugger authored
-
Martina Fumanelli authored
* Add Italian translation for autoclass_tutorial.mdx * Fix synthesis Co-authored-by:martina.fumanelli <martina.fumanelli@MBP-di-martinafumanelli.local>
-
- 09 Jun, 2022 2 commits
-
-
Stas Bekman authored
-
mrbean authored
* convert assertion to raised exception in debertav2 * change assert to raise exception in deberta * fix messages
-