"...git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "b0513b013b10939a2b47ab94933c2cca909716a2"
- 19 Apr, 2023 1 commit
Arthur authored
* initial commit * keys match * update, fix conversion * fixes, inference working * fix * more fixes * more fixes * clean up * more clean up * fix copies and add convnext copied layer norm * stash * pretty big update * cleaning * more cleaning * fixup stuffs * fix copies * fix init * update test removing tokenizer * nits * add pretrained * more nits * remove tracking of pipeline * few fixes * update sam and conversion script * fix mask decoder and prompt encoder conversion * fixes * small update * fix order * fix * fix image embeddings * nits * few fixes * fix logits * clean up * fixes boxes inference * v1 AMG * clean up * some clean up * multi points support * amg working * fixup * clean up * readme * update toctree * fix type hint * multiple fixes * fixup * fixes * updates * updates * more tests * few fixes * change to `SamForMaskGeneration` * doc * fixup * fix more tests * multiple fixes * fix CI tests * refactor processor * renamings * draft the pipeline * refactor * fix tests * fix test * few cleanings * fix test * edit pipeline support chunking * update * add slow tests * fix nit * fixup * fix nit * current chunk pipeline * cast boxes in fp32 * nit * current updates * pipeline works * fixup * clean up config * fix slow tests * fix slow tests * clean up * update doc and pipeline * adds more slow tests * fix slow tests * cleaning * tests pass * add docstring * fix copies * clean up * support batch of images * style * dummy is needed, add tests * fix slow tests * fix CI * update * adds more tests * fixes * fixes * fixup * fixes * few fixes * filter * few fixes * some refactor * touches finales * fix * style * remove pipeline files * fixes nits * revert pipeline changes * fix test * fixup * remove automodel for automatic mask generation * fix failing torch tests * update mdx * revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING` * update sam config based on review Co-authored-by:
amyeroberts <aeroberts4444@gmail.com> Co-authored-by:
sgugger <sylvain.gugger@gmail.com> * update low_resolution_masks -> pred_masks * init ln with layer_norm_eps * add_decomposed_rel_pos doc * forward doc of SamForMaskGeneration * update processor docstring * remove image processor import empty * update for testing * output vision hidden states + clean recomm also test all iou values * fixup * fixup * remove unused * Update src/transformers/models/sam/modeling_sam.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * nits * fix * fix CI tests and slow tests * replace with Amy's processor * clearer docstring * add `SamVisionNeck` * refactor - all CI tests should pass * fix broken import on Google Colab * few fixes here and there * fix another bug * fix more bugs * update and merge * correct ckpt * address comments * add tips * revert * fix docstring * replace with `SamModel` * make fixup * add support for batched images and batched points * make fixup this time, really * make fixup again and again * few fixes here and there, this should be the touche finale * Update docs/source/en/model_doc/sam.mdx * fixup * correct checkpoints * correct name * rm unneeded file * add notebook --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
amyeroberts <aeroberts4444@gmail.com> Co-authored-by:
sgugger <sylvain.gugger@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
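For context, a minimal sketch of prompted mask generation with the `SamModel` / `SamProcessor` classes this commit introduces; the `facebook/sam-vit-huge` checkpoint name, image URL, and example point are assumptions for illustration.

```python
import torch
import requests
from PIL import Image
from transformers import SamModel, SamProcessor

# Hypothetical checkpoint name; any converted SAM checkpoint would do.
processor = SamProcessor.from_pretrained("facebook/sam-vit-huge")
model = SamModel.from_pretrained("facebook/sam-vit-huge")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
input_points = [[[450, 600]]]  # one (x, y) point prompt for the first image

inputs = processor(image, input_points=input_points, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Upscale the low-resolution pred_masks back to the original image size.
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks, inputs["original_sizes"], inputs["reshaped_input_sizes"]
)
```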
-
- 18 Apr, 2023 2 commits
Joao Gante authored
* working mvp * remove breakpoint * fix commit * standardize outputs * tmp commit * tests almost ready * tmp commit * skip a few models * Add streaming; Docs and examples * document limitations * PR commits * Amy PR comments
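Assuming this is the token streaming feature, a minimal sketch of how the streaming API is typically used; the `TextStreamer` class name and the `gpt2` checkpoint are assumptions here rather than details taken from the commit message.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
streamer = TextStreamer(tokenizer)  # prints decoded tokens as they arrive

inputs = tokenizer("An increasing sequence: one,", return_tensors="pt")
# Text is written to stdout token by token instead of only after generation ends.
model.generate(**inputs, streamer=streamer, max_new_tokens=20)
```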
-
Gabriel Yang authored
docs: ko: fix anchor links for docs (auto_tutorial, training) Co-authored-by:
Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by:
Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by:
Na Yeon Han <nayeon2.han@gmail.com> Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by:
Jungnerd <46880056+jungnerd@users.noreply.github.com>
-
- 17 Apr, 2023 4 commits
Sylvain Gugger authored
* Mark auto models as important * Annoying file with bad line endings
-
Wonhyeong Seo authored
docs: ko: tasks/translation.mdx
-
Jungnerd authored
fix: docs: ko: sagemaker anchors and `_toctree.yml` Co-authored-by:
Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by:
Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by:
Sohyun Sim <96299403+sim-so@users.noreply.github.com> Co-authored-by:
Na Yeon Han <nayeon2.han@gmail.com> Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com>
-
Na Yeon Han authored
docs: ko: translated `custom_models.mdx` Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> Co-authored-by:
Gabriel Yang <gabrielwithhappy@gmail.com> Co-authored-by:
Jungnerd <46880056+jungnerd@users.noreply.github.com>
-
- 14 Apr, 2023 3 commits
Mayank Agarwal authored
* Fix word_ids hyperlink * Add suggested fix
-
Sohyun Sim authored
* add ko preprocessing * translate preprocessing.mdx to korean * translate preprocessing.mdx * Update preprocessing.mdx Fixed the line 273 as below: 또한, 특징 추출기에 `sampling_rate` 인자를 추가하여 발생할 수 있는 조용한 오류(silent errors)를 더 잘 디버깅하는 것을 권장합니다. (English: "We also recommend adding the `sampling_rate` argument to the feature extractor in order to better debug any silent errors that may occur.") * translate Image part * translated preprocess.mdx * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * Update docs/source/ko/preprocessing.mdx * fixed translation --------- Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com>
-
Hyeonseo Yun authored
* docs: ko: init: tasks/sequence_classification.mdx * docs: ko: revised: change voca in tasks/sequence_classification.mdx * docs: ko: revised: [RE] change voca in tasks/sequence_classification.mdx * docs: ko: revised: spell check and sentence naturally in tasks/sequence_classification.mdx * docs: ko: revised: spell check and consistent vocabulary in tasks/sequence_classification.mdx * docs: ko: revised: Add full stop and change voca in tasks/sequence_classification.mdx * docs: ko: revised: sync first section templates in tasks/sequence_classification.mdx Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com> * fix: revert use of full-stops to colons * colons are used to emphasize the code block that follows * @0525hhgus @wonhyeongseo docs: ko: revised: sync second section templates in tasks/sequence_classification.mdx Co-Authored-By:
Wonhyeong Seo <wonhseo@kakao.com> * docs: ko: revised: change 'train', 'finetuning' in tasks/sequence_classification.mdx --------- Co-authored-by:
Wonhyeong Seo <wonhseo@kakao.com>
-
- 13 Apr, 2023 3 commits
Joao Gante authored
-
Gabriel Yang authored
translate training doc to Korean
-
NielsRogge authored
* Add model to doc tests * Remove generate and replace by prepare_inputs_for_generation * More fixes * Remove print statements * Update integration tests * Fix generate * Remove model from auto mapping * Use auto processor * Fix integration tests * Fix test * Add inference code snippet * Remove is_encoder_decoder * Update docs * Remove notebook link
-
- 12 Apr, 2023 4 commits
ARKA1112 authored
`generator(model="openai/whisper-large")` always returns an error. As the error says, the generator expects an input, such as the .flac file above; the generator object has no parameter called `model`. Some parameters, like `batch_size`, can be passed to the generator call, but a model has to be specified when instantiating the pipeline, not as an argument to the instance. The correct call should be: generator = pipeline(model="openai/whisper-large", device=0)
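A minimal sketch of the corrected usage described above: the model is bound when the pipeline is constructed, and the call takes the audio input (the file path here is a placeholder).

```python
from transformers import pipeline

# The model is fixed at construction time, not passed per call.
generator = pipeline(model="openai/whisper-large", device=0)

# The call receives the input to transcribe, e.g. a .flac file.
result = generator("audio.flac")
print(result["text"])
```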
-
Younes Belkada authored
* make serialization of int8 models possible * make fixup * add docs * add ability to push to hub and save pretrained * fixes * more addition * more tests * fix issues * change variable * clearer message * adapt from suggestions * few fixes * remove unused function * Update src/transformers/utils/quantization_config.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * address last comments * last warning * clarify doc * protect import * Update src/transformers/modeling_utils.py * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
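A minimal sketch of what this int8 serialization work enables, assuming a CUDA machine with `bitsandbytes` installed; the `bigscience/bloom-560m` checkpoint and the output directory are placeholders.

```python
from transformers import AutoModelForCausalLM

# Load in 8-bit (requires bitsandbytes and a CUDA device).
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m", load_in_8bit=True, device_map="auto"
)

# With this change, saving (and push_to_hub) works for 8-bit models too.
model.save_pretrained("bloom-560m-8bit")
```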
-
pioliverse authored
* resolve conflicts * rebase and make style * test * test * test * rebase and make style * rebase and make style * tests * tests * rewrite some functions * rebase and make style * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * add models and tests * solve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * fix some bugs & docstring * save resolution * make style * delete redefinition code * reformat function * reformat * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * tests * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * resolve conflicts * fix load_tf_weights_in_cpmant * reformat some unrelated files * upgrade quality * resolve conflicts * make style * fix bugs and refactor * modify docstrings and make style * unify import format in __init__.py * fix import-altclp bug * fix copies to update index.md * fix unused config parameters * fix unused config parameters * fix unused config parameters * update README_ja.md * dummy commit for unit test * fix attention mask * add CPMAntTokenizer&-Fast to auto-mapping * drop redundant changes in README_ko * fix defaults in docstring * fix use_cache and some docstring * add missing args in tokenizer * modify tester inheritance * add is_jieba_available * fix some bugs * make style and fix-copies * add doctests * skip integration tests * add is_jieba_available * fix bugs in common tests * adjust docstrings and make style * add argument docstring * adjust code to some specifications * make style and fix-copies * add fast tokenization test * dummy commit for unit test * dummy commit for unit test * dummy commit for unit test * normalize some comments and names * Bert->CPMAnt * camel names and drop redundant codes * make style and fix-copies * add CpmTokenizerFast _import_structure * drop cpmanttokenizerfast in model_doc * fix some problems * fix CPMAnt tokenization for common test * make style and fixup * fix copies and fixup * fix bugs in tokenization test * dummy commit for connection failure in unittest * fix copies * drop trailing comma * fix decorator in tests * dummy commit for connection failure in unittest --------- Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>
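A minimal sketch of the CPM-Ant model added here; the `openbmb/cpm-ant-10b` checkpoint name is an assumption, and the tokenizer needs `jieba` installed (see `is_jieba_available` above).

```python
from transformers import CpmAntForCausalLM, CpmAntTokenizer

# Hypothetical checkpoint name; tokenization requires jieba.
tokenizer = CpmAntTokenizer.from_pretrained("openbmb/cpm-ant-10b")
model = CpmAntForCausalLM.from_pretrained("openbmb/cpm-ant-10b")

inputs = tokenizer("今天天气真好!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```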
-
Arthur authored
-
- 11 Apr, 2023 1 commit
Sylvain Gugger authored
* Make it easier to develop without a dev install * Remove ugly hack that doesn't work anyway
-
- 10 Apr, 2023 3 commits
Sugawara authored
* add GPTNeoXForSequenceClassification * move the labels to logits.device (ref: #22561) * fix
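A minimal sketch of the new `GPTNeoXForSequenceClassification` head; the Pythia checkpoint is an arbitrary GPT-NeoX-family choice, and its classification head is randomly initialized here.

```python
import torch
from transformers import AutoTokenizer, GPTNeoXForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m")
model = GPTNeoXForSequenceClassification.from_pretrained(
    "EleutherAI/pythia-70m", num_labels=2
)

inputs = tokenizer("This library keeps getting better.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, num_labels)
print(logits.argmax(dim=-1))
```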
-
Kirill authored
-
Joel Lamy-Poirier authored
* Add model with cli tool * Remove unwanted stuff * Add new code * Remove inference runner * Style * Fix checks * Test updates * make fixup * fix docs * fix doc * fix test * hopefully fix pipeline tests * refactor * fix CIs * add comment * rename to `GPTBigCodeForCausalLM` * correct readme * make fixup + docs * make fixup * fixes * fixes * Remove pruning * Remove import * Doc updates * More pruning removal * Combine copies * Single MQA implementation, remove kv cache pre-allocation and padding * Update doc * Revert refactor to match gpt2 style * Merge back key and value caches, fix some type hints * Update doc * Fix position ids with padding (PR 21080) * Add conversion script temporarily * Update conversion script * Remove checkpoint conversion * New model * Fix MQA test * Fix copies * try fix tests * FIX TEST!! * remove `DoubleHeadsModel` * add MQA tests * add slow tests * clean up * add CPU checker * final fixes * fixes - fix GPU issue - fixed slow tests - skip disk offload * fix final issue * Simplify and comment baddbmm fix * Remove unnecessary code * Transpose tweaks * Use beta=1 on cpu, improve tests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>
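A minimal sketch of the new `GPTBigCodeForCausalLM`; the SantaCoder checkpoint name is an assumption for illustration.

```python
from transformers import AutoTokenizer, GPTBigCodeForCausalLM

tokenizer = AutoTokenizer.from_pretrained("bigcode/gpt_bigcode-santacoder")
model = GPTBigCodeForCausalLM.from_pretrained("bigcode/gpt_bigcode-santacoder")

inputs = tokenizer("def fibonacci(n):", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```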
-
- 07 Apr, 2023 3 commits
Joao Gante authored
add API warning
-
Wonhyeong Seo authored
docs: feat: Korean pipeline_tutorial Co-authored-by:
Jungnerd <46880056+jungnerd@users.noreply.github.com> Co-authored-by:
Hyeonseo Yun <0525_hhgus@naver.com> Co-authored-by:
gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com> Co-authored-by:
Na Yeon Han <nayeon2.han@gmail.com>
-
gabrielwithappy authored
translate the autoclass_tutorial and fix the typo of the quicktour
-
- 06 Apr, 2023 1 commit
Nicolas Patry authored
* Adding Llama FastTokenizer support. - Requires https://github.com/huggingface/tokenizers/pull/1183 version - Only support byte_fallback for llama, raise otherwise (safety net). - Lots of questions are special tokens How to test:

```python
from transformers.convert_slow_tokenizer import convert_slow_tokenizer
from transformers import AutoTokenizer
from tokenizers import Tokenizer

tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b")

if False:
    new_tokenizer = Tokenizer.from_file("tok.json")
else:
    new_tokenizer = convert_slow_tokenizer(tokenizer)
    new_tokenizer.save("tok.json")

strings = [
    "This is a test",
    "生活的真谛是",
    "生活的真谛是[MASK]。",
    # XXX: This one is problematic because of special tokens
    # "<s> Something something",
]

for string in strings:
    encoded = tokenizer(string)["input_ids"]
    encoded2 = new_tokenizer.encode(string).ids
    assert encoded == encoded2, f"{encoded} != {encoded2}"
    decoded = tokenizer.decode(encoded)
    decoded2 = new_tokenizer.decode(encoded2)
    assert decoded.strip() == decoded2, f"{repr(decoded)} != {repr(decoded2)}"
```

The converter + some test script. The test script. Tmp save. Adding Fast tokenizer + tests. Adding the tokenization tests. Correct combination. Small fix. Fixing tests. Fixing with latest update. Rebased. fix copies + normalized added tokens + copies. Adding doc. TMP. Doc + split files. Doc. Versions + try import. Fix Camembert + warnings -> Error. Fix by ArthurZucker. Not a decorator. * Fixing comments. * Adding more to docstring. * Doc rewriting.
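Once this lands, a fast Llama tokenizer can be requested the usual way; a minimal sketch, assuming a converted checkpoint is available at the same path used in the test script above.

```python
from transformers import AutoTokenizer

# use_fast=True should now resolve to the Rust-backed LlamaTokenizerFast.
tokenizer = AutoTokenizer.from_pretrained("huggingface/llama-7b", use_fast=True)
print(tokenizer("This is a test")["input_ids"])
```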
-
- 05 Apr, 2023 2 commits
Younes Belkada authored
* add deplot + matcha on `transformers` * more docs * correct path * Update docs/source/en/model_doc/deplot.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix * use auto processor * Update docs/source/en/model_doc/matcha.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Update docs/source/en/model_doc/deplot.mdx Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * add correct names --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com>
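A minimal sketch of DePlot inference, assuming it reuses the Pix2Struct classes as the docs added here suggest; the chart URL is a placeholder.

```python
import requests
from PIL import Image
from transformers import Pix2StructForConditionalGeneration, Pix2StructProcessor

processor = Pix2StructProcessor.from_pretrained("google/deplot")
model = Pix2StructForConditionalGeneration.from_pretrained("google/deplot")

# Placeholder URL; any chart image works.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)
inputs = processor(
    images=image,
    text="Generate underlying data table of the figure below:",
    return_tensors="pt",
)
predictions = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(predictions[0], skip_special_tokens=True))
```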
-
Wonhyeong Seo authored
Co-authored-by:gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com>
-
- 04 Apr, 2023 3 commits
Shubhamai authored
* initial commit * review changes * post model PR merge * updating doc
-
Matt authored
* Initial commit * more stash commit * Yet another stash commit * yet more stash commit * Mostly working except for docs / repo consistency * Stop importing model list from torch file * Add TF BLIP models to docs * Add auto classes * Move get_text_features and get_image_features * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/blip/test_modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/models/blip/test_modeling_tf_blip_text.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use channels_last convolutions in TF (better performance + compatibility) * Remove _shape function * Move multi-line statement to one line in PT + TF * Specify tf.keras.layers instead of importing from it * Remove test_gradient_checkpointing and empty test_training methods * move some multi-line statements to one line * Update docstring for generate * Remove pruned heads set * Remove self.seq_len_dim * Fixed issues with loss computation, should resolve some tests. Also ensured that the PT version follows the config for output_attentions and output_hidden_states * ensure original model follows config in more cases * Skip the same cross-attention tests in the PT tests - didn't realize we did it twice! * Add training args throughout the models and layers * make fixup * Fix docstring for inputs_embeds * Add docstring for is_decoder * Add docstrings to text models * Remove redundant computation * Add unpack_inputs / keras_serializable * Add modeling_tf_blip to doctests * Add config classes for keras serialization * Changes to allow model porting with pt-to-tf * Quick fix to decoder head and test tweaks * Revert an issue with masking the embeddings outputs * Allow missing keys in some equivalence tests (for unused layers) * Add tf-pt equivalence tests back in * Update src/transformers/models/blip/modeling_tf_blip.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/blip/modeling_tf_blip_text.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * make fixup * Refactor invert_attention_mask out into tf_utils * Re-enable cross-tests on the PT side too --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
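A minimal sketch of image captioning with the TensorFlow BLIP port added here; the checkpoint name and image URL are assumptions.

```python
import requests
from PIL import Image
from transformers import BlipProcessor, TFBlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = TFBlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

inputs = processor(images=image, return_tensors="tf")
out = model.generate(**inputs)
print(processor.decode(out[0], skip_special_tokens=True))
```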
-
Arthur authored
* fix the prefix tokens * update fast and test values * add legacy behaviour Co-authored-by:
sgugger <sylvain.gugger@gmail.com> * update disclaimer, linkissue PR and behaviral changes * Apply suggestions from code review Co-authored-by:
Lysandre Debut <hi@lysand.re> * styling * make a quote * quote this time --------- Co-authored-by:
sgugger <sylvain.gugger@gmail.com> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
- 03 Apr, 2023 4 commits
Kirill authored
-
Joao Gante authored
* haha text go brrr (but in gradio)
-
Mohammed Jabir authored
* added biogpt token classifier * fix reviews * Updated modeling_biogpt.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Jungnerd authored
docs: ko: sagemaker.mdx
-
- 30 Mar, 2023 2 commits
Manuel de Prada authored
Docs fix: Multinomial sampling decoding needs `num_beams=1`, since the default in the example is 5, not 1. (#22473)
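A minimal sketch of the corrected guidance: multinomial sampling requires `do_sample=True` together with an explicit `num_beams=1`; `gpt2` is an arbitrary checkpoint choice.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Today was an amazing day because", return_tensors="pt")
# num_beams=1 disables beam search so sampling is truly multinomial.
outputs = model.generate(**inputs, do_sample=True, num_beams=1, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```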
-
Joao Gante authored
* haha tokens go brrrr
-
- 28 Mar, 2023 1 commit
fpgaminer authored
Fix bug in perplexity guide calculations and update perplexity numbers.
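For reference, the quantity the guide computes: perplexity is the exponentiated mean negative log-likelihood. A minimal single-window sketch (the guide itself uses a sliding window); `gpt2` is an arbitrary choice.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

enc = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    # With labels set, the loss is the mean cross-entropy over shifted tokens.
    loss = model(enc.input_ids, labels=enc.input_ids).loss
print(torch.exp(loss))  # perplexity = exp(mean negative log-likelihood)
```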
-
- 27 Mar, 2023 2 commits
Arthur authored
* Initial commit * update modeling code * update doc * add functions necessary * fix imports * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix more common tests * style * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but implementing top2 * update * ❗ local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplification * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer_norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a little bit * add support for unslip models when converting * fixup * style * update tests * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉 encoder and decoder logits match 🎉 * styling * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! * cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
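An API-level sketch of the new NLLB-MoE model; note the 54B checkpoint is far too large to run casually, so this is illustrative only, and the target language code is an example.

```python
from transformers import AutoTokenizer, NllbMoeForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-moe-54b")
model = NllbMoeForConditionalGeneration.from_pretrained("facebook/nllb-moe-54b")

inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
outputs = model.generate(
    **inputs,
    # Force the first generated token to the target language code.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```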
-
Nicola Procopio authored
* updated toctree * added and translated mdx documents
-
- 24 Mar, 2023 1 commit
Shubhamai authored
* [WIP] flax resnet * added pretrained flax models, results reproducible * Added pretrained flax models, results reproducible * working on tests * no real code change, just some comments * [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * removing loss from modeling flax output class * fixing classifier tests * fixing comments, model output * cleaning comments * review changes * review changes * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * renaming Flax to PyTorch --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
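A minimal sketch of the Flax ResNet port; the `microsoft/resnet-50` checkpoint and image URL are assumptions (PyTorch weights can be loaded with `from_pt=True` if no Flax weights are published).

```python
import requests
from PIL import Image
from transformers import AutoImageProcessor, FlaxResNetForImageClassification

processor = AutoImageProcessor.from_pretrained("microsoft/resnet-50")
model = FlaxResNetForImageClassification.from_pretrained("microsoft/resnet-50")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="np")
logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])
```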
-