Commits · 843fdf2e420a95bdf93f5c7b5c43151fdaa98e48 · chenpangpang / transformers

20 Apr, 2023 1 commit
- [Examples/TensorFlow] minor refactoring to allow compatible datasets to work (#22879) · 4116d1ec
  Sayak Paul authored Apr 20, 2023
```
minor refactoring to allow compatible datasets to work.
```
  4116d1ec
17 Apr, 2023 2 commits
- Remove accelerate from tf test reqs (#22777) · cd3e0211
  Zachary Mueller authored Apr 17, 2023
```
Remove accelerate from tf
```
  cd3e0211
- Fix sneaky torch dependency in TF example (#22804) · 2237127a
  Matt authored Apr 17, 2023
  
  2237127a
14 Apr, 2023 1 commit

[Examples] TPU-based training of a language model using TensorFlow (#21657) · 390e121f

Sayak Paul authored Apr 14, 2023



* add: tokenizer training script for TF TPU LM training.

* add: script for preparing the TFRecord shards.

* add: sequence of execution to readme.

* remove limit from the tfrecord shard name.

* Add initial train_model.py

* Add basic training arguments and model init

* Get up to the point of writing the data collator

* Pushing progress so far!

* Complete first draft of model training code

* feat: grouping of texts efficiently.
Co-authored-by: Matt <rocketknight1@gmail.com>

* Add proper masking collator and get training loop working

* fix: things.

* Read sample counts from filenames

* Draft README

* Improve TPU warning

* Use distribute instead of distribute.experimental

* Apply suggestions from code review
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Modularize loading and add MLM probability as arg

* minor refactoring to better use the cli args.

* readme fillup.

* include tpu and inference sections in the readme.

* table of contents.

* parallelize maps.

* polish readme.

* change script name to run_mlm.py

* address PR feedback (round I).

---------
Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

390e121f
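The "Read sample counts from filenames" step listed above can be sketched as follows. This is a minimal illustration, not the code from this commit: it assumes shards are named `<prefix>-<shard index>-<sample count>.tfrecord`. The naming scheme, the regular expression, the `count_samples` helper and the example paths are all assumptions made for the example.

```python
import re

# Assumed (hypothetical) shard naming: "<prefix>-<shard index>-<sample count>.tfrecord"
SHARD_PATTERN = re.compile(r"-(\d+)-(\d+)\.tfrecord$")


def count_samples(shard_paths):
    """Sum the per-shard sample counts encoded in the file names."""
    total = 0
    for path in shard_paths:
        filename = path.split("/")[-1]
        match = SHARD_PATTERN.search(filename)
        if match is None:
            raise ValueError(f"Cannot read a sample count from {filename!r}")
        # Group 1 is the shard index, group 2 is the sample count.
        total += int(match.group(2))
    return total


# Hypothetical shard paths, used only to exercise the helper.
shards = [
    "gs://bucket/train/dataset-0-2048.tfrecord",
    "gs://bucket/train/dataset-1-1024.tfrecord",
]
total_samples = count_samples(shards)
```

Encoding the count in the file name means a training script can derive the number of steps per epoch without reading every record first.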

13 Apr, 2023 1 commit
- v4.29.0.dev0 · 888c4a2a
  Sylvain Gugger authored Apr 12, 2023
  
  888c4a2a
24 Mar, 2023 2 commits
- TensorFlow: pin maximum version to 2.12 (#22364) · 88dae78f
  Joao Gante authored Mar 24, 2023
  
  88dae78f
- Pin tensorflow-text to go with tensorflow (#22362) · 6587125c
  Sylvain Gugger authored Mar 24, 2023
```
* Pin tensorflow-text to go with tensorflow

* Make it more convenient to pin TensorFlow

* setup don't like f-strings
```
  6587125c
14 Mar, 2023 1 commit
- v4.28.0.dev0 · ebdb185b
  Sylvain Gugger authored Mar 14, 2023
  
  ebdb185b
07 Mar, 2023 1 commit
- Stop requiring Torch for our TF examples! (#21997) · d128f2ff
  Matt authored Mar 07, 2023
```
* Stop requiring Torch for our TF examples!

* Slight tweak to logging in the example itself
```
  d128f2ff
06 Mar, 2023 1 commit

Add TF contrastive image text finetuning example (#21939) · 5d8efc79

Matt authored Mar 06, 2023

* Initial commit

* stash commit

* Add model checkpointing and pushing

* Fix model name inference

* Update README

* Remove a couple of Torch references

* Update copyright date

* make fixup

* Update PushToHubCallback args!

* Remove the torch summary

* Add strategy.scope

5d8efc79

01 Mar, 2023 1 commit
- Add check for different embedding types in examples (#21881) · 1d3a1cc4
  Matt authored Mar 01, 2023
```
* Add check for different embedding types in examples

* Correctly update summarization example
```
  1d3a1cc4
22 Feb, 2023 1 commit
- Apply ruff flake8-comprehensions (#21694) · 5e8c8eb5
  Aaron Gokaslan authored Feb 22, 2023
  
  5e8c8eb5
06 Feb, 2023 1 commit

Update quality tooling for formatting (#21480) · 6f79d264

Sylvain Gugger authored Feb 06, 2023

* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies

6f79d264

01 Feb, 2023 1 commit

Add TF image classification example script (#19956) · e5db7051

amyeroberts authored Feb 01, 2023



* TF image classification script

* Update requirements

* Fix up

* Add tests

* Update test fetcher
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix directory path

* Adding `zero-shot-object-detection` pipeline doctest. (#20274)

* Adding `zero-shot-object-detection` pipeline doctest.

* Remove nested_simplify.

* Add generate kwargs to `AutomaticSpeechRecognitionPipeline` (#20952)

* Add generate kwargs to AutomaticSpeechRecognitionPipeline

* Add test for generation kwargs

* Trigger CI

* Data collator returns np

* Update feature extractor -> image processor

* Bug fixes - updates to reflect changes in API

* Update flags to match PT & run faster

* Update instructions - Maria's comment

* Update examples/tensorflow/image-classification/README.md

* Remove slow decorator

---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

e5db7051

24 Jan, 2023 1 commit
- Use return_tensors="np" instead of "tf" (#21266) · 071529bd
  Matt authored Jan 24, 2023
```
Return NP instead of TF tensors for our data loading pipeline
```
  071529bd
23 Jan, 2023 1 commit
- v4.27.0.dev0 · 7119bb05
  Sylvain Gugger authored Jan 23, 2023
  
  7119bb05
05 Jan, 2023 1 commit

[NumPy] Remove references to deprecated NumPy type aliases (#21022) · 35a7052b

Roy Hvaara authored Jan 05, 2023



[NumPy] Remove references to deprecated NumPy type aliases.

This change replaces references to a number of deprecated NumPy type aliases (np.bool, np.int, np.float, np.complex, np.object, np.str) with their recommended replacement (bool, int, float, complex, object, str).

NumPy 1.24 drops the deprecated aliases, so we must remove uses before updating NumPy.
Co-authored-by: Peter Hawkins <phawkins@google.com>

35a7052b
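The replacement described in this commit can be illustrated with a short sketch. The array names are hypothetical. The point is only that the builtin types are accepted wherever the removed aliases were used as dtypes.

```python
import numpy as np

# Before (deprecated aliases, removed in NumPy 1.24):
#   mask = np.zeros(3, dtype=np.bool)
#   weights = np.ones(3, dtype=np.float)

# After: the builtin types are valid dtype arguments.
mask = np.zeros(3, dtype=bool)
weights = np.ones(3, dtype=float)
labels = np.array(["a", "b"], dtype=object)
```

When a specific width matters, explicitly sized types such as `np.int64` or `np.float32` remain available.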

01 Dec, 2022 1 commit
- v4.26.0.dev0 · 60d1f31b
  Sylvain Gugger authored Dec 01, 2022
  
  60d1f31b
18 Nov, 2022 1 commit
- Pin to the right version... · a3f74580
  Sylvain Gugger authored Nov 18, 2022
  
  a3f74580
03 Nov, 2022 1 commit
- Only resize embeddings when necessary (#20043) · 06886d5a
  Sylvain Gugger authored Nov 03, 2022
```
* Only resize embeddings when necessary

* Add comment
```
  06886d5a
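A minimal sketch of the idea, assuming that "necessary" means the tokenizer vocabulary has outgrown the embedding matrix. The `resize_if_needed` helper and the NumPy array standing in for a model's embedding matrix are hypothetical, and this is not the code from the commit.

```python
import numpy as np


def resize_if_needed(embeddings: np.ndarray, vocab_size: int) -> np.ndarray:
    """Grow the embedding matrix only when the vocabulary no longer fits."""
    current_rows, hidden_size = embeddings.shape
    if vocab_size <= current_rows:
        # Vocabulary already fits: return the matrix untouched.
        return embeddings
    # Append zero-initialised rows for the new tokens (a real model would
    # typically use its own initialiser here).
    extra = np.zeros((vocab_size - current_rows, hidden_size), dtype=embeddings.dtype)
    return np.concatenate([embeddings, extra], axis=0)


embeddings = np.ones((8, 4), dtype=np.float32)
unchanged = resize_if_needed(embeddings, vocab_size=6)   # fits, no resize
grown = resize_if_needed(embeddings, vocab_size=10)      # two rows added
```

Skipping the resize when the vocabulary already fits avoids shrinking a matrix that may have been deliberately allocated larger than the vocabulary.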
01 Nov, 2022 1 commit
- v4.25.0.dev0 · c3a93d8d
  Sylvain Gugger authored Oct 31, 2022
  
  c3a93d8d
10 Oct, 2022 1 commit
- Dev version · 10100979
  Lysandre authored Oct 10, 2022
  
  10100979
22 Sep, 2022 1 commit
- Reduce LR for TF MLM example test (#19156) · 83dc6377
  Matt authored Sep 22, 2022
  
  83dc6377
14 Sep, 2022 1 commit
- Dev version · 16913b3c
  Lysandre authored Sep 14, 2022
  
  16913b3c
10 Aug, 2022 1 commit

TF Examples Rewrite (#18451) · 6eb51450

Matt authored Aug 10, 2022



* Finished QA example

* Dodge a merge conflict

* Update text classification and LM examples

* Update NER example

* New Keras metrics WIP, fix NER example

* Update NER example

* Update MC, summarization and translation examples

* Add XLA warnings when shapes are variable

* Make sure batch_size is consistently scaled by num_replicas

* Add PushToHubCallback to all models

* Add docs links for KerasMetricCallback

* Add docs links for prepare_tf_dataset and jit_compile

* Correct inferred model names

* Don't assume the dataset has 'lang'

* Write metrics in text classification

* Add 'framework' to TrainingArguments and TFTrainingArguments

* Export metrics in all examples and add tests

* Fix training args for Flax

* Update command line args for translation test

* make fixup

* Fix accidentally running other tests in fp16

* Remove do_train/do_eval from run_clm.py

* Remove do_train/do_eval from run_mlm.py

* Add tensorflow tests to circleci

* Fix circleci

* Update examples/tensorflow/language-modeling/run_mlm.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/test_tensorflow_examples.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/translation/run_translation.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/token-classification/run_ner.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix save path for tests

* Fix some model card kwargs

* Explain the magical -1000

* Actually enable tests this time

* Skip text classification PR until we fix shape inference

* make fixup
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

6eb51450

06 Aug, 2022 2 commits

`transformers-cli login` => `huggingface-cli login` (#18490) · 9129fd03

Julien Chaumond authored Aug 06, 2022

* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`

9129fd03

Just re-reading the whole doc every couple of months 😬 (#18489) · 8d1f9039

Julien Chaumond authored Aug 06, 2022

* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task

8d1f9039

01 Aug, 2022 1 commit
- Fix ROUGE add example check and update README (#18398) · 941d2331
  Sylvain Gugger authored Aug 01, 2022
```
* Fix ROUGE add example check and update README

* Stay consistent in values
```
  941d2331
29 Jul, 2022 1 commit

Replace `as_target` context managers by direct calls (#18325) · 986526a0

Sylvain Gugger authored Jul 29, 2022



* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py
Co-authored-by: amyeroberts <amy@huggingface.co>

* Style
Co-authored-by: amyeroberts <amy@huggingface.co>

986526a0

28 Jul, 2022 1 commit

Migrate metric to Evaluate library for tensorflow examples (#18327) · a2586795

Vijay S Kalmath authored Jul 28, 2022

* Migrate metric to Evaluate library in tf examples

Currently, the TensorFlow examples use the `load_metric` function from the
Datasets library; this commit migrates those calls to the `load` function
from the Evaluate library.

Fix for #18306

* Migrate `metric` to Evaluate for all tf examples

a2586795

27 Jul, 2022 1 commit
- Dev version · c89a592e
  Lysandre authored Jul 27, 2022
  
  c89a592e
13 Jul, 2022 1 commit
- Add summarization name mapping for MultiNews (#18117) · fde22c75
  John Giorgi authored Jul 13, 2022
```
* Add summarization name mapping for MultiNews
```
  fde22c75
16 Jun, 2022 1 commit
- v4.21.0.dev0 · 7c6ec195
  Sylvain Gugger authored Jun 16, 2022
  
  7c6ec195
07 Jun, 2022 1 commit

Add examples telemetry (#17552) · 3cab9027

Sylvain Gugger authored Jun 07, 2022

* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow

3cab9027

12 May, 2022 2 commits
- Black preview (#17217) · afe5d42d
  Sylvain Gugger authored May 12, 2022
```
* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black
```
  afe5d42d
- Dev version · 5294fa12
  Lysandre Debut authored May 12, 2022
  
  5294fa12
27 Apr, 2022 1 commit

Misc. fixes for PyTorch QA examples: (#16958) · c82e017a

Leonid Boytsov authored Apr 27, 2022

1. Fixes evaluation errors that appear when you train/eval on SQuAD v2 (one was newly encountered, and one was previously reported in "Running SQuAD 1.0 sample command raises IndexError" #15401 but not completely fixed).
2. Removes boolean arguments that don't use store_true. Please don't use these: ANY non-empty string is converted to True in this case, which is clearly not the desired behavior (and it creates a LOT of confusion).
3. All no-trainer test scripts now save metric values in the same way (with the right prefix eval_), which is consistent with the trainer-based versions.
4. Adds the forgotten model.eval() in the no-trainer versions. This improved some results, but not all of them (see the discussion at the end). Please see the F1 scores and the discussion below.

c82e017a
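The pitfall in point 2 can be reproduced with the standard library alone. The sketch below is illustrative only: the argument names are hypothetical and are not taken from the example scripts.

```python
import argparse

parser = argparse.ArgumentParser()
# Pitfall: argparse calls bool() on the raw string, and bool("False") is True.
parser.add_argument("--with_tracking_buggy", type=bool, default=False)
# Preferred: a flag that is False unless it is present on the command line.
parser.add_argument("--with_tracking", action="store_true")

buggy = parser.parse_args(["--with_tracking_buggy", "False"])
absent = parser.parse_args([])
present = parser.parse_args(["--with_tracking"])
```

Here `buggy.with_tracking_buggy` ends up `True` even though the user typed `False`, while the `store_true` flag is `False` when absent and `True` only when passed.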

19 Apr, 2022 1 commit
- fix `run_clm.py` seeking text column name twice (#16624) · b74a9553
  Wonjae Kim authored Apr 19, 2022
  
  b74a9553
06 Apr, 2022 1 commit
- Dev version · a180efe7
  Lysandre Debut authored Apr 06, 2022
  
  a180efe7
04 Apr, 2022 1 commit
- Add use_auth to load_datasets for private datasets to PT and TF examples (#16521) · 24a85cca
  Karim Foda authored Apr 04, 2022
```
* fix formatting and remove use_auth

* Add use_auth_token to Flax examples
```
  24a85cca