Commits · abc400b06a8ab26cd438b6e9add3aad082ffc48f · chenpangpang / transformers

21 Jun, 2022 13 commits

Add final_layer_norm to OPT model (#17785) · abc400b0

Thomas Wang authored Jun 21, 2022



* Add final_layer_norm to OPT model

* Add JAX and TF version

* Fix Keras name

* Woops

* Allow for non breaking change

* Apply suggestions from code review

* add tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

abc400b0

Properly check for a TPU device (#17802) · 52404cba
Zachary Mueller authored Jun 21, 2022

52404cba
Fix test for BF16 detection (#17803) · ef23fae5
Sylvain Gugger authored Jun 21, 2022

ef23fae5

TF Sharded (#17713) · 7cced021

Arthur authored Jun 21, 2022



* initial commit

* update modeling tf utils

* quality

* clean and update args

* update

* remove potential bug

* code quality

* update

* update max shard

* update tests for sharding from pretrained

* fix remaining test

* make style

* h5py if tf available

* update and fix test

* fix test

* style

* modified push to hub to support shard for TF

* quick fix

* update code

* merge branch main and style

* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update based on reviews

* update doc

* update and style

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update based on reviews

* fix typo

* style
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

7cced021
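
The "update max shard" step above concerns splitting a checkpoint into several files once it exceeds a size limit. The following is only a minimal sketch of that idea, assuming a greedy strategy over per-weight byte sizes; `shard_weights` and its arguments are hypothetical names for illustration and are not taken from the library's code.

```python
from typing import Dict, List


def shard_weights(weight_sizes: Dict[str, int], max_shard_size: int) -> List[List[str]]:
    """Greedily group weight names into shards (hypothetical helper).

    A new shard is started whenever adding the next weight would push the
    current shard past max_shard_size. A single weight larger than the limit
    still gets a shard of its own.
    """
    shards: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name, size in weight_sizes.items():
        if current and current_size + size > max_shard_size:
            shards.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        shards.append(current)
    return shards


# Made-up sizes, in arbitrary units.
example = shard_weights({"a": 4, "b": 4, "c": 4, "d": 10}, max_shard_size=8)
```

Keeping weights in their original order makes the resulting shard layout deterministic, which is convenient when an index file has to map each weight name to its shard.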

Use 5e-5 For BigBird PT/Flax equivalence tests (#17780) · f47afefb

Yih-Dar authored Jun 21, 2022



* rename to check_pt_flax_outputs

* update check_pt_flax_outputs

* use 5e-5 for BigBird PT/Flax test
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

f47afefb

Prepare transformers for v0.8.0 huggingface-hub release (#17716) · 6a5272b2

Lysandre Debut authored Jun 21, 2022



* Prepare CI for v0.8.0

* pin hfh (revert before merge)

* Revert "pin hfh (revert before merge)"

This reverts commit a0103140e1c77b810ffcb735192968bc03be3e1f.

* Test rc3

* Test latest rc

* Unpin to the RC
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

6a5272b2

Fix forward reference imports in DeBERTa configs (#17800) · 7bc88c05
Sylvain Gugger authored Jun 21, 2022

7bc88c05

Fix Automatic Download of Pretrained Weights in DETR (#17712) · 27e90738

Anugunj Naman authored Jun 21, 2022



* added use_backbone_pretrained

* style fixes

* update

* Update detr.mdx

* Update detr.mdx

* Update detr.mdx

* update using doc py

* Update detr.mdx

* Update src/transformers/models/detr/configuration_detr.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

27e90738

[ViTMAE] Fix docstrings and variable names (#17710) · b681e12d

NielsRogge authored Jun 21, 2022



* Fix docstrings and variable names

* Rename x to something better

* Improve messages

* Fix docstrings and add test for greyscale images
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>

b681e12d

Add link to notebook (#17791) · 3fab17fc
NielsRogge authored Jun 21, 2022
```
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
```
3fab17fc

[CodeParrot] Near-deduplication with jaccard similarity (#17054) · da2bd2ae

Jia LI authored Jun 21, 2022



* deduplication draft

* update style

* update style test

* dummy test main

* rename modules

* rename functions

* return extremes in deduplicate_clusters

* update style

* cast str for gzip

* update doc string

* time processing

* use dataset map to compute minhash

* fill value for short token

* remove da map method

* update style

* use share object to multiprocess

* update style

* use f-string and minor fix
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* update style

* use module parameters

* change ds_dedup to ds_filter

* save ds_dedup

* mv test to script tests

* make jaccard threshold a parameter of deduplicate_dataset

* update style

* add doc strings

* update style

* add doc string for DuplicationIndex

* save files into data dir

* update readme

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

* make near deduplication optional

* move near deduplication in README

* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* use f string
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>

da2bd2ae
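
The commit above filters near-duplicate files by Jaccard similarity, with the threshold exposed as a parameter. As a rough illustration of the measure itself, here is a sketch that computes the exact Jaccard similarity of two token sets. The whitespace tokenization, the function names, and the default threshold of 0.85 are assumptions made for this example; the commit approximates the similarity with MinHash rather than comparing every pair exactly.

```python
def jaccard_similarity(tokens_a, tokens_b):
    """Return |A ∩ B| / |A ∪ B| for two token collections."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        # Two empty inputs are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def is_near_duplicate(code_a, code_b, jaccard_threshold=0.85):
    """Flag two snippets as near-duplicates (hypothetical helper).

    Splitting on whitespace is a simplification chosen for this sketch.
    """
    similarity = jaccard_similarity(code_a.split(), code_b.split())
    return similarity >= jaccard_threshold


half_overlap = jaccard_similarity("a b c".split(), "b c d".split())
```

Exact pairwise comparison is quadratic in the number of files, which is why an approximate scheme such as MinHash is the usual choice at dataset scale.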

add onnx support for deberta and debertav2 (#17617) · eb16be41

mrbean authored Jun 21, 2022



* add onnx support for debertav2

* debertav2 -> deberta-v2 in onnx features file

* remove causal lm

* add deberta-v2-xlarge to onnx tests

* use self.type().dtype() in xsoftmax
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* remove hack for deberta

* remove unused imports

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* use generate dummy inputs

* linter

* add imports

* add support for deberta v1 as well

* deberta does not support multiple choice

* Update src/transformers/models/deberta/configuration_deberta.py
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

* one line ordered dict

* fire build
Co-authored-by: Jingya HUANG <44135271+JingyaHuang@users.noreply.github.com>

eb16be41

Add UL2 (just docs) (#17740) · 8fcbe275

Patrick von Platen authored Jun 21, 2022



* Add UL2
Co-authored-by: Daniel Hesslow <Daniel.Hesslow@gmail.com>

* Correct naming

* sort better

* up

* apply sylvains suggestion

8fcbe275

20 Jun, 2022 5 commits

Update modeling_longt5.py (#17777) · da27c4b3

Brad Jascob authored Jun 20, 2022

On line 180, `torch.tensor(-1.0, xxx)` gives the error "TypeError: 'float' object cannot be interpreted as an integer"
This is because the dtype here is `int64`. For `dtype=int64`, this needs to simply be `-1`.
This impacts the long-t5-tglobal-x model. It does not impact the long-t5-local-x version, which does not appear to call this line.

da27c4b3

Not use -1e4 as attn mask (#17306) · d3cb2888

Yih-Dar authored Jun 20, 2022



* Use torch.finfo(self.dtype).min

* for GPTNeoX

* for Albert

* For Splinter

* Update src/transformers/models/data2vec/modeling_data2vec_audio.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix -inf used in Bart-like models

* Fix a few remaining -inf

* more fix

* clean up

* For CLIP

* For FSMT

* clean up

* fix test

* Add dtype argument and use it for LayoutLMv3

* update FlaxLongT5Attention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

d3cb2888
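
The change above replaces a fixed `-1e4` additive mask with the most negative finite value of the working dtype. The pure-Python sketch below is an illustration of why a fixed constant can be too weak, not the library's code: `-sys.float_info.max` stands in for `torch.finfo(self.dtype).min`, and the scores are deliberately exaggerated.

```python
import math
import sys


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(scores, keep, mask_value):
    """Add mask_value to every position whose keep flag is False."""
    masked = [s if k else s + mask_value for s, k in zip(scores, keep)]
    return softmax(masked)


scores = [0.0, 20000.0]  # the second position has a very large raw score
keep = [True, False]     # and it is the position that should be masked out

# With -1e4 the masked position still dominates the attention weights.
weak = masked_softmax(scores, keep, -1e4)

# With the most negative finite float the masked position gets zero weight.
strong = masked_softmax(scores, keep, -sys.float_info.max)
```

A finite minimum is used here instead of `-inf` so that a row in which every position is masked does not turn into `nan` after the softmax.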

Fix cache for GPT-Neo-X (#17764) · fdb12080

Sylvain Gugger authored Jun 20, 2022

* Fix cache for GPT-Neo-X

* Add more tests

fdb12080

deprecate is_torch_bf16_available (#17738) · a2d34b7c

Stas Bekman authored Jun 20, 2022

* deprecate is_torch_bf16_available

* address suggestions

a2d34b7c

TF: BART compatible with XLA generation (#17479) · 132402d7

Joao Gante authored Jun 20, 2022

* Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus

132402d7

18 Jun, 2022 2 commits

Attempt to change Push CI to workflow_run (#17753) · 6589e510

Yih-Dar authored Jun 18, 2022



* Use workflow_run event for push CI

* change to workflow_run

* Add comments
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

6589e510

Added translation of index.mdx to Portuguese Issue #16824 (#17565) · 0d92798b

Rafael Zimmer authored Jun 17, 2022



* Added translation of installation.mdx to Portuguese, as well
as default templates of _toctree.yml and _config.py

* [ build_documentation.yml ] - Updated doc_builder to build
documentation in Portuguese.
[ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx.

* [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder.

[ pipeline_tutorial.mdx ] - Grammar changes.

* [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial.

* [ multilingual.mdx ] - Added Portuguese translation for multilingual tutorial.

[ training.mdx ] - Added Portuguese translation for training tutorial.

* [ preprocessing.mdx ] - WIP

* Update _toctree.yml

* Adding Pré-processamento to _toctree.yml

* Update accelerate.mdx

* Nits and eliminate preprocessing file while it is ready

* [ index.mdx ] - Translated to Portuguese the index presentation page.

* [ docs/source/pt ] - Updated _toctree.yml to match newest translations.

* Fix build_pr_documentation.yml

* Fix index nits

* nits in _toctree
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

0d92798b

17 Jun, 2022 6 commits

Save huggingface checkpoint as artifact in mlflow callback (#17686) · 522a9ece

Swetha Mandava authored Jun 17, 2022



* Fix eval to compute rouge correctly for rouge_score

* styling

* moving sentence tokenization to utils from run_eval

* saving ckpt in mlflow

* use existing format of args

* fix documentation
Co-authored-by: Swetha Mandava <smandava@nvidia.com>

522a9ece

Migrate HFDeepSpeedConfig from trfrs to accelerate (#17623) · 21a77242

Sourab Mangrulkar authored Jun 17, 2022



* Migrate HFDeepSpeedConfig from trfrs to accelerate

* add `accelerate` to testing dep

* addressing comments

* addressing comments

Using `_shared_state` and avoiding object creation. This is necessary as `notebook_launcher` in `launchers.py` checks `len(AcceleratorState._shared_state)>0` to throw an error.

* resolving comments

1. Use simple API from accelerate to manage the deepspeed config integration
2. Update the related documentation

* reverting changes and addressing comments

* docstring correction

* addressing nits

* addressing nits

* addressing nits 3

* bumping up the accelerate version to 0.10.0

* resolving import

* update setup.py to include deepspeed dependencies

* Update dependency_versions_table.py

* fixing imports

* reverting changes to CI dependencies for "run_tests_pipelines_tf*" tests

These changes didn't help with resolving the failures and I believe this needs to be addressed in another PR.

* removing `accelerate` as hard dependency

Resolves issues related to CI Tests

* adding `accelerate` as dependency for building docs

resolves failure in Build PR Documentation test

* adding `accelerate` as dependency in "dev" to resolve doc build issue

* resolving comments

1. adding `accelerate` to extras["all"]
2. Including a check for accelerate too before importing HFDeepSpeedConfig from there
Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* resolving comments
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

21a77242
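
One note in the commit above mentions inspecting `_shared_state` instead of creating an object, because a length check on that dict is used to detect prior initialization. The sketch below illustrates the general shared-state pattern under stated assumptions: `SharedState` and `is_initialized` are hypothetical names, and this is not the `accelerate` implementation.

```python
class SharedState:
    """Every instance shares one attribute dict stored on the class."""

    _shared_state = {}

    def __init__(self, **kwargs):
        # Point this instance's attributes at the class-level dict.
        self.__dict__ = self._shared_state
        self.__dict__.update(kwargs)


def is_initialized() -> bool:
    """Check for prior initialization without creating an instance."""
    return len(SharedState._shared_state) > 0


before = is_initialized()          # nothing has populated the state yet
first = SharedState(device="cpu")  # populates the shared dict
second = SharedState()             # sees the same attributes as `first`
after = is_initialized()
```

Reading the class-level dict directly leaves the state untouched, whereas constructing an instance with arguments would populate it and make a later emptiness check fail.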

Bump notebook in /examples/research_projects/lxmert (#17743) · e44a569f

dependabot[bot] authored Jun 17, 2022

Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

e44a569f

Bump notebook in /examples/research_projects/visual_bert (#17742) · 5089a2d4

dependabot[bot] authored Jun 17, 2022

Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.

---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

5089a2d4

feat: add num_workers arg to DataLoader (#17751) · 2d7c1bb1
greg2451 authored Jun 17, 2022

2d7c1bb1

Enable PyTorch nightly build CI (#17335) · ca169dbd

Yih-Dar authored Jun 17, 2022



* nightly build pytorch CI

* fix working dir

* change time and event name
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

ca169dbd

16 Jun, 2022 5 commits
- Remove needless file · 3c7e56fb
  Sylvain Gugger authored Jun 16, 2022
  
  3c7e56fb
- v4.21.0.dev0 · 7c6ec195
  Sylvain Gugger authored Jun 16, 2022
  
  7c6ec195
- Refine Bf16 test for deepspeed (#17734) · 36d46479
  Sylvain Gugger authored Jun 16, 2022
```
* Refine BF16 check in CPU/GPU

* Fixes

* Renames
```
  36d46479
- Fix tf shared embedding (#17730) · f44e2c2b
  Arthur authored Jun 16, 2022
```
* fix the naming

* from pt in test for now

* make style

* slow test and removed from_pt
```
  f44e2c2b
- Fix mask token in the example (#17725) · 2eadb7e5
  Jiayi Pan authored Jun 16, 2022
```
VisualBERT uses the bert-base-uncased tokenizer, so the mask token should be [MASK] rather than {mask}.
```
  2eadb7e5
15 Jun, 2022 8 commits
- Sort the model doc Toc Alphabetically (#17723) · 3981ee86
  Sylvain Gugger authored Jun 15, 2022
  
  3981ee86
- normalize keys_to_ignore (#17722) · 66f89332
  Stas Bekman authored Jun 15, 2022
  
  66f89332
- CLI: Add flag to push TF weights directly into main (#17720) · c3c62b5d
  Joao Gante authored Jun 15, 2022
```
* Add flag to push weights directly into main
```
  c3c62b5d
- Update requirements.txt (#17719) · 6ebeeeef
  Jeff Rasley authored Jun 15, 2022
  
  6ebeeeef
- Revert "Change push CI to run on workflow_run event (#17692)" (#17717) · 50415b84
  Yih-Dar authored Jun 15, 2022
```
This reverts commit b76290f4.
```
  50415b84
- [Wav2Vec2Conformer] Official release (#17709) · 7f14839f
  Patrick von Platen authored Jun 15, 2022
```
* [Wav2Vec2Conformer] Official release

* remove from not-in-readme
```
  7f14839f
- Documentation: RemBERT fixes (#17641) · 242cc6e2
  Stefan Schweter authored Jun 15, 2022
```
* rembert: fix python codeblock

* rembert: use correct google/rembert checkpoint name in documentation

* rembert: use correct google/rembert checkpoint name in TF documentation
```
  242cc6e2
- Change push CI to run on workflow_run event (#17692) · b76290f4
  Yih-Dar authored Jun 15, 2022
```
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
  b76290f4
14 Jun, 2022 1 commit
- fix tolerance for a bloom slow test (#17634) · d453ea61
  Younes Belkada authored Jun 14, 2022
  
  d453ea61