Commits · f4a0d6ff867e8a82a33d7a653e7d45372a463271 · chenpangpang / transformers

20 May, 2021 1 commit

A cleaner and more scalable implementation of symbolic tracing (#11763) · f4a0d6ff

Michael Benayoun authored May 20, 2021



Cleaner and more scalable implementation of symbolic tracing with torch.fx, with support for new architectures:
- ALBERT
- DistilBERT
- MobileBERT
- MegatronBERT
- GPT2
- GPT Neo
Co-authored-by: Michael Benayoun <michael@huggingface.co>

f4a0d6ff
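
For readers unfamiliar with the term, symbolic tracing runs a function on placeholder objects that record each operation instead of computing it, which yields a graph of the computation. The sketch below is a toy pure-Python illustration of that idea only. It is not torch.fx and not the code from this commit; `Proxy`, `trace`, and `f` are hypothetical names chosen for the example.

```python
class Proxy:
    """Placeholder value that records every operation applied to it."""

    def __init__(self, name, graph):
        self.name = name
        self.graph = graph  # shared list of (output, op, inputs) tuples

    def _record(self, op, other):
        # Name the result after its position in the graph: "v0", "v1", ...
        out = Proxy(f"v{len(self.graph)}", self.graph)
        other_name = other.name if isinstance(other, Proxy) else repr(other)
        self.graph.append((out.name, op, (self.name, other_name)))
        return out

    def __add__(self, other):
        return self._record("add", other)

    def __mul__(self, other):
        return self._record("mul", other)


def trace(fn, arg_names):
    """Run fn on proxies; return the recorded graph and the output node name."""
    graph = []
    result = fn(*(Proxy(name, graph) for name in arg_names))
    return graph, result.name


def f(x, y):
    return (x + y) * 2


graph, output = trace(f, ["x", "y"])
for node in graph:
    print(node)
print("output:", output)
```

A real tracer also has to handle module calls, attribute access, and control flow that depends on input values; this toy version covers only `+` and `*`.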

20 Apr, 2021 1 commit

[GPTNeo] create local attention mask ones (#11335) · cfd2eaa8

Suraj Patil authored Apr 20, 2021

* create local attention mask ones

* remove old method, address patricks comment

cfd2eaa8

06 Apr, 2021 1 commit

[WIP] GPT Neo cleanup (#10985) · 2a8115f0

Suraj Patil authored Apr 06, 2021

* better names

* add attention mixin

* all slow tests in one class

* make helper methods static so we can test

* add local attention tests

* better names

* doc

* apply review suggestions

2a8115f0

30 Mar, 2021 2 commits

GPT Neo few fixes (#10968) · 83d38c9f

Suraj Patil authored Mar 30, 2021

* fix checkpoint names

* auto model

* fix doc

83d38c9f

GPT Neo (#10848) · 86026437

Suraj Patil authored Mar 30, 2021



* lets begin

* boom boom

* fix out proj in attn

* fix attention

* fix local attention

* add tokenizer

* fix imports

* autotokenizer

* fix checkpoint name

* cleanup

* more clean-up

* more cleanup

* output attentions

* fix attn mask creation

* fix imports

* config doc

* add tests

* add slow tests

* quality

* add conversion script

* copyright

* typo

* another bites the dust

* fix attention tests

* doc

* add embed init in convert function

* fix copies

* remove tokenizer

* enable caching

* address review comments

* improve config and create attn layer list internally

* more consistent naming

* init hf config from mesh-tf config json file

* remove neo tokenizer from doc

* handle attention_mask in local attn layer

* attn_layers => attention_layers

* add tokenizer_class in config

* fix docstring

* raise if len of attention_layers is not same as num_layers

* remove tokenizer_class from config

* more consistent naming

* fix doc

* fix checkpoint names

* fp16 compat

* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

86026437
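
Two items in the commit message above, "improve config and create attn layer list internally" and "raise if len of attention_layers is not same as num_layers", describe a small piece of config logic. The sketch below shows one way such a check could look. The `attention_types` input format (a list of `[pattern, repeat_count]` pairs) and the function names `expand_attention_types` and `build_attention_layers` are assumptions made for illustration, not code taken from the commit.

```python
def expand_attention_types(attention_types):
    """Expand [[pattern, repeats], ...] into a flat per-layer list.

    Assumed format: each item pairs a list of attention kinds with the
    number of times that pattern is repeated.
    """
    layers = []
    for pattern, repeats in attention_types:
        layers.extend(list(pattern) * repeats)
    return layers


def build_attention_layers(attention_types, num_layers):
    """Build the per-layer list and fail early if its length is wrong."""
    layers = expand_attention_types(attention_types)
    if len(layers) != num_layers:
        raise ValueError(
            f"Expected {num_layers} attention layers, got {len(layers)}: {layers}"
        )
    return layers


# Alternating global/local attention over four layers.
layers = build_attention_layers([[["global", "local"], 2]], num_layers=4)
print(layers)

# A mismatched layer count is rejected instead of silently accepted.
try:
    build_attention_layers([[["global", "local"], 2]], num_layers=5)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

Deriving the list inside the config keeps the user-facing setting compact, and the length check turns a misconfigured model into an immediate error rather than a shape mismatch later on.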