Commits · 60b1d6b45b578efa543094bb39a1b104efea31ae · chenpangpang / transformers

15 Jun, 2021 5 commits
- Add course banner (#12157) · 60b1d6b4
  Sylvain Gugger authored Jun 15, 2021
```
* Add course banner

* Update course banner
```
- Have dummy processors have a `from_pretrained` method (#12145) · d07b540a
  Lysandre Debut authored Jun 15, 2021
  
- Use a released version of optax rather than installing from Git. (#12173) · 9b393240
  Avital Oliver authored Jun 15, 2021
```
Use a released version of optax rather than installing from Git
```
- [Flax generate] Add params to generate (#12171) · 9bc9e598
  Patrick von Platen authored Jun 15, 2021
```
* fix_torch_device_generate_test

* remove @

* add params as input

* finish
```
- Add video links to the documentation (#12162) · a55dc157
  Sylvain Gugger authored Jun 15, 2021
  
14 Jun, 2021 19 commits

consistent nn. and nn.functional: part 5 docs (#12161) · 04028317
Stas Bekman authored Jun 14, 2021

[style] consistent nn. and nn.functional: part 4 `examples` (#12156) · 88e84186
Stas Bekman authored Jun 14, 2021
```
* consistent nn. and nn.functional: p4 examples

* restore
```
[style] consistent nn. and nn.functional: part 3 `tests` (#12155) · 372ab9cd
Stas Bekman authored Jun 14, 2021
```
* consistent nn. and nn.functional: p3 templates

* restore
```

Flax Big Bird (#11967) · d9c0d08f

Vasudev Gupta authored Jun 15, 2021



* add flax bert

* bert -> bigbird

* original_full ported

* add debugger

* init block sparse

* fix copies ; gelu_fast -> gelu_new

* block sparse port

* fix block sparse

* block sparse working

* all ckpts working

* fix-copies

* make quality

* init tests

* temporary fix for FlaxBigBirdForMultipleChoice

* skip test_attention_outputs

* fix

* gelu_fast -> gelu_new ; fix multiple choice model

* remove nsp

* fix sequence classifier

* fix

* make quality

* make fix-copies

* finish

* Delete debugger.ipynb

* Update src/transformers/models/big_bird/modeling_flax_big_bird.py

* make style

* finish

* bye bye jit flax tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


consistent nn. and nn.functional: p2 templates (#12153) · a156da9a
Stas Bekman authored Jun 14, 2021

[Flax] Fix flax pt equivalence tests (#12154) · 007be9e4
Patrick von Platen authored Jun 14, 2021
```
* fix_torch_device_generate_test

* remove @

* upload
```

Adding TFWav2Vec2Model (#11617) · d438eee0

Will Rice authored Jun 14, 2021



* [WIP] Add TFWav2Vec2Model

Work in progress for adding a tensorflow version of Wav2Vec2

* feedback changes

* small fix

* Test Feedback Round 1

* Add SpecAugment and CTC Loss

* correct spec augment mask creation

* docstring and correct copyright

* correct bugs

* remove bogus file

* finish tests correction

* del unnecessary layers

* Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* correct final bug

* Feedback Changes
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


[style] consistent nn. and nn.functional (#12124) · 1ed2ebf6
Stas Bekman authored Jun 14, 2021
```
* consistent nn. and nn.functional

* fix glitch

* fix glitch #2
```

[optim] implement AdafactorSchedule (#12123) · ff7c8168

Stas Bekman authored Jun 14, 2021



* implement AdafactorSchedule

* typo

* fix

* Update src/transformers/optimization.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


fix error message (#12148) · fe357648
Suraj Patil authored Jun 14, 2021


[lm examples] Replicate --config_overrides addition to other LM examples (#12135) · 9de62cfb

Kumar Abhishek authored Jun 14, 2021



* [lm examples] Replicate --config_overrides addition to other LM examples

* Removing no trainer files changes

* Update README
Co-authored-by: Kumar Abhishek <kabhishek@expedia.com>


Use text_column_name variable instead of "text" (#12132) · cd7961b6

Nicholas Broad authored Jun 14, 2021



* Use text_column_name variable instead of "text"

`text_column_name` was already defined above where I made the changes and it was also used below where I made changes.

This is a very minor change. If a dataset does not use "text" as the column name, then the `tokenize_function` will now use whatever column is assigned to `text_column_name`. `text_column_name` is just the first column name if "text" is not a column name. It makes the function a little more robust, though I would assume that 90%+ of datasets use "text" anyway.

* black formatting

* make style
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>

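A minimal sketch of the column-selection behaviour described in the commit message above. This is not the code from the example scripts: `pick_text_column`, `make_tokenize_function`, and `toy_tokenizer` are hypothetical names introduced for illustration, and the toy tokenizer only splits on whitespace.

```python
def pick_text_column(column_names):
    """Return "text" if it is a column, otherwise fall back to the first column."""
    return "text" if "text" in column_names else column_names[0]


def make_tokenize_function(tokenizer, text_column_name):
    """Build a tokenize function that reads from the chosen column."""
    def tokenize_function(examples):
        # Use the selected column instead of a hard-coded "text" key.
        return tokenizer(examples[text_column_name])
    return tokenize_function


def toy_tokenizer(texts):
    """Stand-in tokenizer (assumption): whitespace split, for illustration only."""
    return {"input_ids": [t.split() for t in texts]}


# A dataset whose text column is not named "text".
column_names = ["sentence", "label"]
text_column_name = pick_text_column(column_names)
tokenize_function = make_tokenize_function(toy_tokenizer, text_column_name)

batch = {"sentence": ["hello world", "foo"], "label": [0, 1]}
tokenized = tokenize_function(batch)
```

With a hard-coded `examples["text"]`, the batch above would raise a `KeyError`; reading from `text_column_name` avoids that.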

Don't log anything before logging is setup in examples (#12121) · b8ab5413
Sylvain Gugger authored Jun 14, 2021
```
* Don't log anything before logging is setup in examples

* Last example
```
[Flax] Add links to google colabs (#12146) · 7566fefa
Patrick von Platen authored Jun 14, 2021
```
* fix_torch_device_generate_test

* remove @

* add colab links
```

Feature to use the PreTrainedTokenizerFast class as a stand-alone tokenizer (#11810) · 476ba679

SaulLu authored Jun 14, 2021



* feature for tokenizer without slow/legacy version

* format

* modify common test

* add tests

* add PreTrainedTokenizerFast to AutoTokenizer

* format

* change tokenizer common test in order to be able to run test without a slow version

* update tokenizer fast test in order to use `rust_tokenizer_class` attribute instead of `tokenizer_class`

* add AutoTokenizer test

* replace `if self.tokenizer_class is not None` with `if self.tokenizer_class is None`

* remove obsolete change in comment

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/tokenization_utils_fast.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* change `get_main_tokenizer` into `get_tokenizers`

* clarify `get_tokenizers` method

* homogenize with `test_slow_tokenizer` and `test_rust_tokenizer`

* add `test_rust_tokenizer = False` to tokenizers that don't define a fast version

* `test_rust_tokenizer = False` for BertJapaneseTokenizer

* `test_rust_tokenizer = False` for BertJapaneseCharacterTokenizationTest
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


FlaxBart (#11537) · 4a51b1dd

Daniel Stancl authored Jun 14, 2021



* Start working on FlaxBart

* Create modeling_flax_bart.py

* Write FlaxBartAttention

* Add FlaxBartEncoderLayer

* Add FlaxBartDecoderLayer and some typing

* Add helper function for FlaxBart

* shift_tokens_right

* _make_causal_mask

* _expand_mask

* Add PositionalEmbedding and fix init_std naming

* Add FlaxBartPretrainedModel

* Add FlaxBartEncoder

* Add FlaxBartEncoder

* Add FlaxBartEncoder among modules to be imported

* YET WE CANNOT INITIALIZE THAT!! :(

* Make BartEncoder working

Change BartEncoder to instance of nn.Module so far

* Add FlaxBartDecoder

* Add FlaxBartModel

* TODO to make model run -> Prepare model inputs

* Resolve padding

* Add FlaxBartModel

* Add FlaxBartModel into importable modules

* Remove FlaxBartEncoder and FlaxBartDecoder from importable modules

* make style; not properly working

* make style; make quality not pass due to some import I left

* Remove TODO for padding_idx in nn.Embed so far

* Add FlaxBartForConditionalGeneration

* Incorporate Flax model output classes, i.e. return_dict

* Add another models and incorporate use_cache arg

* Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering

* Incorporate use_cache arg from PyTorch implementation

* Add all necessary Flax output utils

* Add FlaxBartForCausalLM; not working yet

* Add minor improvements; still lacks some functionality

* Update docs, src and tests

* Add support of FlaxBart to docs/source

* Fix some bugs in FlaxBart source code

* Add some necessary tests for FlaxBart models - jit_compilation not passing

* Fix tests and add test_head_masking

* Fix tests for @jax.jit computation

* Add test_head_masking

* Migrate FlaxBart tests from jax.numpy to numpy

* Remove FlaxBartForCausalLM

* Clean repo

* fix bart model weight structure

* Fix FlaxBartForSequenceClassification

Slicing cannot be used under jit, so the way the sentence
representation is selected from hidden_states must be changed.

* Allow FlaxBartForSequenceClassification for testing pt_flax equivalence

* Allow testing for FlaxBartForQA for pt_flax equivalence

* Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6

* remove past_key_values

* remove inputs_embeds and make input_ids required

* add position ids

* re-write attention layer

* fix dataclass

* fix pos embeds and attention output

* fix pos embeds

* expose encode method

* expose decode method

* move docstring to top

* add cache for causal attn layer

* remove head masking for now

* s2s greedy search first pass

* boom boom

* fix typos

* fix greedy generate for bart

* use encoder, decoder layers instead of num_hidden_layers

* handle encoder_outputs

* cleanup

* simplify decoding

* more clean-up

* typos

* Change header + add {decoder_,}position_ids into 2 models

* add BartConfig

* fix existing tests

* add encode, decode methods

* Fix shift_tokens_right for JIT compilation + clarify one condition

* fix decode

* encoder => encode

* simplify generate

* add tests for encode and decode

* style

* add tests for cache

* fix equivalence tests

* sample generate now works with seq2seq

* generation tests

* initialize dense layers

* docstring and cleanup

* quality

* remove get/set input_embeddings

* address Patrick's suggestions

* decode for every model, remove encoder_outputs from call

* update tests accordingly

* decode returns only decoder outputs and logits

* fix arguments

* doc encode, decode methods

* correct base_model_prefix

* fix test for seq classif model

* fix docs
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>


add readme for flax clm (#12111) · d36fce82

Suraj Patil authored Jun 14, 2021



* add readme for flax clm

* use section link for tokenizer

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update metrics
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Add mlm pretraining xla torch readme (#12011) · 16c0efca

Patrick von Platen authored Jun 14, 2021



* fix_torch_device_generate_test

* remove @

* upload

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* Update examples/flax/language-modeling/README.md

* add more info

* finish

* fix
Co-authored-by: Patrick von Platen <patrick@huggingface.co>


Fix megatron_gpt2 attention block's causal mask (#12007) · ecd6efe7

Guido Novati authored Jun 14, 2021



* Fix megatron_gpt2 attention block's causal mask.

* compatibility with checkpoints created with recent versions of Megatron-LM

* added integration test for the released Megatron-GPT2 model

* code style changes

* added option to megatron conversion script to read from config file
Co-authored-by: Guido Novati <gnovati@nvidia.com>


13 Jun, 2021 1 commit
- Fix t5 error message (#12136) · 783b0dd5
  Jonathan Chang authored Jun 13, 2021
```
* Fix t5 error message

* Fix again
```
11 Jun, 2021 3 commits

Add from_pretrained to dummy timm objects (#12097) · 3b1f5caf

Lysandre Debut authored Jun 11, 2021



* Add from_pretrained to dummy timm

* Fix at the source

* Update utils/check_dummies.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Missing pretrained dummies

* Style
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


Flax CLM script (#12023) · 15b498f3

Suraj Patil authored Jun 11, 2021

* first draft

* max_seq_length => block_size

* fix arg names

* fix typos

* fix loss calculation

* add max examples, fix train eval steps, metrics

* optimizer mask

* fix perplexity, metric logging

* fix logging

* data_collator => data_loader

* refactor loss_fn

* support single GPU

* pass distributed to write_metric

* fix jitting

* fix single device training

* fix single device metrics

* close inner progress bars once finished

* add overwrite_cache arg

* fix dataset caching issue

* add more logs

* few small fixes

* address Nicholas's suggestions

* fix docstr

* address Patrick's suggestions

* make flake happy

* pass new new_dropout_rng to apply_gradients

* reset train metrics after every epoch

* remove distributed logic, small fixes


Fix head masking generate tests (#12110) · e47765d8
Patrick von Platen authored Jun 11, 2021
```
* fix_torch_device_generate_test

* remove @

* fix tests
```

10 Jun, 2021 10 commits

add relevant description to tqdm in examples (#11927) · d2753dcb
Bhavitvya Malik authored Jun 11, 2021
```
* add relevant `desc` in examples

* require_version datasets>=1.8.0
```

Flax VisionTransformer (#11951) · 9a9314f6

Jayendra authored Jun 10, 2021



* adding vit for flax

* added test for Flax-vit and some bug-fixes

* overrode methods where variable changes were necessary for flax_vit test

* added FlaxViTForImageClassification for test

* Update src/transformers/models/vit/modeling_flax_vit.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>

* made changes suggested in PR

* Adding jax-vit models for autoimport

* swapping num_channels and height,width dimension

* fixing the docstring for torch-like inputs for VIT

* add model to main init

* add docs

* doc, fix-copies

* docstrings

* small test fixes

* fix docs

* fix docstr

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* style
Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>


Fix a condition in test_generate_with_head_masking (#11911) · 0eaeae2e

Daniel Stancl authored Jun 10, 2021

* Fix a condition in test_generate_with_head_masking

* Fix usage of head_mask in bigbird_pegasus

* Fix head masking for speech2text

* Resolve copy mismatch + drop unwanted print statement

* Fix the condition


Appending label2id and id2label to models to ensure inference works properly (#12102) · bebbdd0f
Matt authored Jun 10, 2021

Minor style edits · 4cda08de
Matt authored Jun 10, 2021

Update README.md to cover the TF GLUE example. · 7f08dbd1
Matt authored Jun 10, 2021

Fix quality · d72e5a3a
Sylvain Gugger authored Jun 10, 2021


New TF GLUE example (#12028) · 73a53265

Matt authored Jun 10, 2021



* Pushing partially-complete new GLUE example

* First draft of the new TF GLUE example! Needs a little more testing to be sure but it's almost ready.

* Fix to the fit() call

* Bugfixes, making sure TPU and multi-GPU support is ready

* Remove logger line that depends on Pytorch

* Style pass

* Deleting old TF GLUE example

* Include label2id and id2label in the saved model config

* Don't clobber the existing model.config.label2id

* Style fixes

* Update examples/tensorflow/text-classification/run_glue.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


CLIPFeatureExtractor should resize images with kept aspect ratio (#11994) · 9d2cee8b

Tobias Norlund authored Jun 10, 2021



* Resize with kept aspect ratio

* Fixed failed test

* Overload center_crop and resize methods instead

* resize should handle non-PIL images

* update slow test

* Tensor => tensor
Co-authored-by: patil-suraj <surajp815@gmail.com>


Add text_column_name and label_column_name to run_ner and run_ner_no_trainer args (#12083) · 472a8676
kumapo authored Jun 10, 2021
```
* Add text_column_name and label_column_name to run_ner args

* Minor fix: grouping for text and label column name
```

09 Jun, 2021 2 commits
- [Wav2Vec2ForPretraining] Correct checkpoints wav2vec2 & fix tests (#12089) · bc6f51e5
  Patrick von Platen authored Jun 09, 2021
```
* fix_torch_device_generate_test

* remove @

* fix tests
```
- rm require_version_examples (#12088) · 61e19198
  Stas Bekman authored Jun 09, 2021
  