Commits · 4cdb67caba9abee8b5b94b4f88f55e95a3d1013f · chenpangpang / transformers

06 Dec, 2021 15 commits

Use cross_attention_hidden_size in Encoder-Decoder models (#14378) · 4cdb67ca

Yih-Dar authored Dec 07, 2021



* add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)

* for TFEncoderDecoderModel

* add equivalence test for TFEncoderDecoderModel

* fix

* fix failed equivalence tests

* remove unused import

* add detailed comment

* Fix check_equivalence_tf_to_pt by using encoder/decoder

* cleaning

* Use cross_attention_hidden_size in speech-to-text

* clean fast init logging msg in encoder decoder models

* increase tol from 1e-5 to 1e-3 for tf test

* style

* style

* make sure projection layer can run

* remove type conversion + add check

* fix conflict (config.output_hidden_size)

* Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

4cdb67ca
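
The commit notes above describe a projection layer and a size check tied to `cross_attention_hidden_size`. The following is a minimal sketch of when a projection between encoder and decoder would be needed, assuming the rule is "project only if the hidden sizes differ and the decoder does not declare its own cross-attention width". `ToyConfig` and `needs_projection` are hypothetical names used for illustration, not the library's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a model config; only the fields used here.
    hidden_size: int
    cross_attention_hidden_size: Optional[int] = None


def needs_projection(encoder: ToyConfig, decoder: ToyConfig) -> bool:
    """Return True if encoder outputs should be projected before cross-attention."""
    if decoder.cross_attention_hidden_size is not None:
        # The decoder's cross-attention expects this width, so the encoder
        # output must already match it (assumed check, see lead-in).
        if decoder.cross_attention_hidden_size != encoder.hidden_size:
            raise ValueError(
                "cross_attention_hidden_size must equal the encoder hidden_size"
            )
        return False
    # No explicit cross-attention width: project only when the sizes differ.
    return encoder.hidden_size != decoder.hidden_size


print(needs_projection(ToyConfig(768), ToyConfig(1024)))
print(needs_projection(ToyConfig(768), ToyConfig(1024, cross_attention_hidden_size=768)))
```

In the first call the sizes differ and nothing tells the decoder what width to expect, so a projection is needed; in the second the decoder's cross-attention is already sized for the encoder output.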

Remove nonworking workflow for now · 381b05a3
Sylvain Gugger authored Dec 06, 2021

381b05a3

fix flax examples tests (#14646) · 75ae287a

Suraj Patil authored Dec 07, 2021

* make tensorboard optional

* update test_fetcher for flax examples

* make the tests slow

75ae287a

Add a job to test the documentation build (#14645) · 03fda7b7
Sylvain Gugger authored Dec 06, 2021

* Add a job to the documentation build

* Add caching

* Test cache

03fda7b7

Fix syntax for class references (#14644) · e513c16e
Sylvain Gugger authored Dec 06, 2021

e513c16e

Auto processor fix (#14623) · e9688875

Lysandre Debut authored Dec 06, 2021



* Add AutoProcessor class
Init and tests
Add doc
Fix init
Update src/transformers/models/auto/processing_auto.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Reverts to tokenizer or feature extractor when available
Adapt test

* Revert "Adapt test"

This reverts commit bbdde5fab02465f24b54b227390073082cb32093.

* Revert "Reverts to tokenizer or feature extractor when available"

This reverts commit 77659ff5d21b6cc0baf6f443017e35e056a525bb.

* Don't revert everything Lysandre!
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

e9688875

fix flax example tests (#14643) · cbe60265
Suraj Patil authored Dec 06, 2021

cbe60265

doc: mismatch between pooler/d_output (#14641) · df085d8e

guhur authored Dec 06, 2021

The model outputs a `pooler_output`, whereas the documentation examples were using `pooled_output`.

df085d8e

Add GPTJForQuestionAnswering (#14503) · 0f3f045e

tucan9389 authored Dec 07, 2021



* Add GPTJForQuestionAnswering

* Reformat for GPTJForQuestionAnswering

* Fix isort error

* make style for GPTJForQA

* Add _keys_to_ignore_on_load_missing

* Change the sequence of qa and classification
Co-authored-by: Suraj Patil <surajp815@gmail.com>

0f3f045e

Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. (#14310) · 1ccc033c

Jay Zhang authored Dec 06, 2021



* Update code to resolve comments left in previous PR.

* Add README.md file for this example.

* Update examples/onnx/pytorch/translation/README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update examples/onnx/pytorch/translation/README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update examples/onnx/pytorch/translation/README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update README.md file to resolve comments.

* Add a section name.

* Update examples/onnx/pytorch/translation/README.md
Co-authored-by: Gary Miguel <garymm@garymm.org>

* Add more comments for _convert_past_list_to_tuple().

* Change the default file name to a consistent one.

* Fix a format issue.

* Update examples/onnx/pytorch/translation/README.md
Co-authored-by: Gary Miguel <garymm@garymm.org>

* Update examples/onnx/pytorch/translation/run_onnx_exporter.py
Co-authored-by: Gary Miguel <garymm@garymm.org>

* Update examples/onnx/pytorch/translation/README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Change the folder to summarization and address some other comments.

* Update the torch version.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Gary Miguel <garymm@garymm.org>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

1ccc033c

[urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617) · 6cdc3a78
Julien Chaumond authored Dec 06, 2021

* Replace outdated model tags with their now-canonical pipeline types

* spam the CI till it's green

6cdc3a78

add flax example tests in CI workflow (#14637) · c824d7ed
Suraj Patil authored Dec 06, 2021

c824d7ed

fix typo (#14635) · bc8a9f41
Suraj Patil authored Dec 06, 2021

bc8a9f41

Add Flax example tests (#14599) · c5bd732a

Suraj Patil authored Dec 06, 2021

* add test for glue

* add tests for clm

* fix clm test

* add summarization tests

* more tests

* fix few tests

* add test for t5 mlm

* fix t5 mlm test

* fix tests for multi device

* cleanup

* ci job

* fix metric file name

* make t5 more robust

c5bd732a

updated readme with proper arguments (#14624) · 803a8cd1
Kamal Raj authored Dec 06, 2021

803a8cd1

05 Dec, 2021 1 commit

fix a typo (#14626) · 3977b584

(Bill) Yuchen Lin authored Dec 04, 2021

3977b584

03 Dec, 2021 6 commits

Make DefaultDataCollator importable from root (#14588) · 73ec4340

Matt authored Dec 03, 2021

* Make DefaultDataCollator importable from root

* Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings

* make style

* Add DefaultDataCollator to data_collator.rst

* Add DefaultDataCollator to data_collator.rst

73ec4340

[trainer] add tf32-mode control (#14606) · 71b1bf7e

Stas Bekman authored Dec 03, 2021



* [trainer] add --tf32 support

* it's pt>=1.7

* it's pt>=1.7

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

71b1bf7e
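
The "switch to 3-state logic" step above can be sketched as follows, assuming the three states are "unset", "on" and "off" and that an unset flag leaves the backend default untouched. `resolve_tf32` and its parameters are hypothetical names for illustration; in PyTorch the underlying switch is `torch.backends.cuda.matmul.allow_tf32`.

```python
from typing import Optional


def resolve_tf32(requested: Optional[bool], backend_default: bool, supported: bool) -> bool:
    """Three-state flag: None keeps the backend default, True/False override it."""
    if requested is None:
        # The user expressed no preference: do not touch the backend setting.
        return backend_default
    if requested and not supported:
        # Asking for TF32 where it cannot work is treated as an error here.
        raise ValueError("TF32 was requested but is not supported in this environment")
    return requested


print(resolve_tf32(None, backend_default=True, supported=True))
print(resolve_tf32(False, backend_default=True, supported=True))
```

A plain boolean could not distinguish "the user said nothing" from "the user said no", which is what the third state is for.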

Fix doc builder (#14616) · aada989a
Lysandre Debut authored Dec 03, 2021

* Fix doc builder

* Fix doc builder

* Fix doc builder

aada989a

2022 is the year of multi-modality (#14610) · ec47baeb

Lysandre Debut authored Dec 03, 2021



* 2022 is the year of multi-modality

* Small fix

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review

* Apply to documentation index

* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

ec47baeb

[CI] move env print to util, add pt, nccl versions (#14607) · e62091d5
Stas Bekman authored Dec 03, 2021

* move env print to util, add pt, nccl versions

* style

* version

* align

e62091d5

Improve tokenizer tests (#13594) · 66ea7391

Li-Huai (Allan) Lin authored Dec 03, 2021

* Use new method to acquire tokenizers

* Resolve TODOs.

* Style

* Fix

* Enable do_lower_case in test_tokenize_special_tokens

* Apply suggestion from code review

* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291b1f71e5bc3604ca281c000000c4648.

* Fix FNet mask token tokenization

* Complete everything

* Apply suggestions from code review

66ea7391

02 Dec, 2021 8 commits

fix #14524 (IndexError when mask prob is too low) (#14525) · 6645eb61

Nik authored Dec 02, 2021

* fix #14524 (IndexError when mask prob is too low)

* fix formatting

* correct documentation, add option for setting min_num_masks

* change the semantic meaning of `mask_prob` in _compute_mask_indices

With this commit the meaning of `mask_prob` actually adheres to the probability for each vector to be the start of a masked span.

* fix check_copies test

* fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices

* fix typo

6645eb61
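
A minimal sketch of how a low mask probability can yield zero spans, assuming `mask_prob` is read as an upper bound on the overall masked fraction, as the final documentation fix above puts it. The function name and the `min_masks` parameter are hypothetical and chosen for illustration only.

```python
def num_masked_spans(sequence_length: int, mask_prob: float,
                     mask_length: int, min_masks: int = 0) -> int:
    """Number of spans of `mask_length` so that at most ~mask_prob of the sequence is masked."""
    if mask_length < 1 or mask_length > sequence_length:
        raise ValueError("mask_length must be between 1 and sequence_length")
    # Rounding down means a small mask_prob can produce zero spans.
    spans = int(mask_prob * sequence_length / mask_length)
    # An optional lower bound guarantees that something gets masked.
    spans = max(spans, min_masks)
    # Never ask for more non-overlapping spans than fit in the sequence.
    return min(spans, sequence_length // mask_length)


print(num_masked_spans(100, 0.05, 10))
print(num_masked_spans(100, 0.05, 10, min_masks=2))
```

With `mask_prob=0.05` the first call rounds down to zero spans, which is the kind of empty result that downstream indexing has to guard against; the second call shows the effect of a lower bound.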

change tf.math.divide with int(/) to remove dim_per_head from the TF graph (#14600) · 96cc02b5
yis11178 authored Dec 02, 2021

Co-authored-by: yis <yis@graphcore.ai>

96cc02b5
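
The idea in this commit title can be illustrated with a short sketch: when both operands are known Python integers, the per-head dimension can be computed with ordinary Python arithmetic, so the result is a plain `int` constant rather than the output of a graph operation. The helper name and the divisibility check are assumptions made for this example.

```python
def dim_per_head(hidden_size: int, n_heads: int) -> int:
    """Compute the per-head dimension as a plain Python int."""
    if hidden_size % n_heads != 0:
        raise ValueError("hidden_size must be divisible by n_heads")
    # int(a / b) on Python ints is evaluated eagerly, outside any graph.
    return int(hidden_size / n_heads)


head_dim = dim_per_head(768, 12)
# The static value can be used directly when building a target shape.
target_shape = (-1, 128, 12, head_dim)
print(head_dim, target_shape)
```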

Add CodeParrot 🦜 codebase (#14536) · 43f953cc

Leandro von Werra authored Dec 02, 2021



* add readme skeleton

* update readme

* add initialization script

* add deduplication script

* add codeparrot training script

* add code generation evaluation

* add validation loss script

* add requirements

* update readme

* tweak readme

* make style

* add highlights to readme

* add CLIs to scripts

* add tokenizer training script

* add docstring to constant length dataset

* fix defaults in arguments

* update readme with cli

* move image to hub

* tweaks of readme

* fix cli commands

* add author

* explain env variables

* fix formatting

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* replace generic with gpt2 tokenizer
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

43f953cc

Python 3.6 -> Python 3.7 for TF runs (#14598) · e4c67d60
Lysandre Debut authored Dec 02, 2021

e4c67d60

[Flax] Add FlaxBlenderbotSmall (#14576) · 50d909be

Daniel Stancl authored Dec 02, 2021



* [WIP] Add FlaxBlenderbotSmall

* Revert some unintentionally changed files

Revert some files unintentionally changed by improperly filled cookiecutter instructions.

* Fix repo consistency

* Fix Flax-PT equivalence

* Apply suggestions from code review

* Update index.mdx

* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>

50d909be

Adds a git pull instruction to the documentation builder (#14597) · 77d87e73
Lysandre Debut authored Dec 02, 2021

* Adds a git pull instruction

* master -> main

77d87e73

Update doc img links (#14593) · 275402bf

Mishig Davaadorj authored Dec 02, 2021

* Update doc img links

* Rename toctree.yml -> _toctree.yml (#14594)

* Update doc img links

* Update performance.md img link

275402bf

Rename toctree.yml -> _toctree.yml (#14594) · 4f68de62
Mishig Davaadorj authored Dec 02, 2021

4f68de62

01 Dec, 2021 6 commits

[doc] bf16/tf32 guide (#14579) · fbe278c7

Stas Bekman authored Dec 01, 2021



* [doc] bf16/tf32 guide

* expand

* expand

* Update docs/source/performance.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fbe278c7

Fix mask token handling (#14364) · 934e2799

Li-Huai (Allan) Lin authored Dec 02, 2021

* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291b1f71e5bc3604ca281c000000c4648.

* Fix FNet mask token tokenization

934e2799

Doc new front (#14590) · 4df7d05a

Sylvain Gugger authored Dec 01, 2021



* Convert PretrainedConfig doc to Markdown

* Use syntax

* Add necessary doc files (#14496)

* Doc fixes (#14499)

* Fixes for the new front

* Convert DETR file for table

* Title is needed

* Simplify a bit

* Even simpler

* Remove imports

* Fix typo in toctree (#14516)

* Fix checkpoints badge

* Update versions.yml format (#14517)

* Doc new front github actions (#14512)

* Doc new front github actions

* Fix docstring

* Fix feature extraction utils import (#14515)

* Address Julien's comments

* Push to doc-builder

* Ready for merge

* Remove old build and deploy

* Doc misc fixes (#14583)

* Rm versions.yml from doc

* Fix converting.rst

* Rm pretrained_models from toctree

* Fix index links (#14567)

* Fix links in README

* Localized READMEs

* Fix copy script

* Fix find doc script

* Update README_ko.md
Co-authored-by: Julien Chaumond <julien@huggingface.co>

Co-authored-by: Julien Chaumond <julien@hugg...

4df7d05a

fix autocast for older pytorch · 14cc50d0
Stas Bekman authored Dec 01, 2021

14cc50d0

FlaxGPTJ (#14396) · 4c0dd199

Suraj Patil authored Dec 01, 2021

* add flax gptj

* no bias in attention dense

* no wpe

* fix rotary embeddings

* fix rotary embeds

* fix rotary embeds

* quality

* doc and quality

* fix equivalence tests

4c0dd199

WIP: Support for Training with BF16 (#13207) · 70996a54

Jamie DeAntonis authored Nov 30, 2021



* started bf16 integration

* minor changes

* code now runs

* style

* lay foundation for bf16 testing

* lay foundation for bf16 testing

* start the tests

* better bf16 check

* style

* 2 separate checkers - one for bf16 support, another for bf16+autocast

* Update src/transformers/training_args.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* a couple of comment resolutions

* more comment resolutions

* resolved a small bug

* just some print statements

* added todo marking

* added a todo

* adjust for API change s/fast_dtype/dtype/

* fix style

* merge 2 bf16 util functions

* bf16 now does scaling too

* Add support for bfloat16

* Revert T5 layernorm to float32

This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920.

* Add comment about conversion to float32 before returning the numpy data

* Add comment about AMP-bfloat16 incompatibility

* Fix formatting

* typo

* reformer / bf16

* cleanup

* require at least pt-1.10

* fix

* will deal with deepspeed separately

* cleanup

* revert

* cleanup

* fp16_full_eval and bf16_full_eval are separate modes

* proper deprecation

* cleanup

* test and fixes

* spelling

* cleanup

* add a note that this API is experimental
Co-authored-by: jamie <jamie@cortx.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: suriya <suriya@cortx.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>

70996a54
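
The "require at least pt-1.10" step above implies a version gate. Below is a minimal sketch of the version part of such a check only (hardware support is a separate question), comparing integer tuples rather than floats or strings. Both function names are hypothetical and not taken from the library.

```python
def parse_major_minor(version: str) -> tuple:
    """Turn a version such as "1.10.0+cu113" into (1, 10)."""
    # Drop any local build suffix after "+", then keep major and minor only.
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version: str, minimum: tuple = (1, 10)) -> bool:
    """True if `version` is at least `minimum`, compared as integer tuples."""
    return parse_major_minor(version) >= minimum


print(meets_minimum("1.10.0+cu113"))
print(meets_minimum("1.9.1"))
```

Tuple comparison matters here because "1.10" read as a float is 1.1, which sorts below 1.9.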

30 Nov, 2021 4 commits

VisionTextDualEncoder (#13511) · fc1d97f2

Suraj Patil authored Nov 30, 2021



* init vision_text_dual_encoder

* fix merge

* remove extra heads

* fix tests

* remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP

* remove archive map

* fix imports

* fix more imports

* fix init

* delete tokenizers

* fix imports

* clean

* support clip's vision model

* handle None config

* begin tests

* more test and few fixes

* warn about newly init weights

* more tests

* add loss to model

* remove extra classes from doc

* add processor

* doc and small fixes

* add start docstr

* update flax model

* flax tests

* more flax tests

* doc

* quality

* doc and quality

* fix doc

* doc

* remove comments

* update warning

* quality

* fix docs

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* replace asserts, fix imports

* update imports

* fix import

* address some review comments

* fix check

* reduce tolerance

* fix test

* add flax integration test

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

* fix style

* add pt_flax_equivalence test in PT tests

* add pt integration test

* update test

* use pre-trained checkpoint in examples
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

fc1d97f2

use functional interface for softmax in attention (#14198) · 6ed9882d

Thomas Viehmann authored Nov 30, 2021

* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!

6ed9882d

Add documentation for multi-label classification (#14168) · 4176bc16
giacomo snidero authored Nov 30, 2021

* update example docstring multilabel example

* update example docstring multilabel example

4176bc16

[Flax] Add FlaxBlenderbot (#13633) · faacd747

Daniel Stancl authored Nov 30, 2021



* Init Flax implementation for Blenderbot

* Add a majority of stuff except for tests

* make style quality

* Add tests and fix some bugs

* Add tests

* Clean source code and fix some bugs

* Fix copies and docs

* Fix jax device condition for tests

* Fix layer norm in the encoder

* Fix a few typos in the test file

* make fix-copies

* make fix-copies

* fix layer norm

* Fix Flax params dtype (#13090)

* Fix PR reference (#13098)

* make fix-copies

* Update tests/test_modeling_flax_blenderbot.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>

faacd747