Unverified Commit 0a3b9733 authored by Zhiqi Huang, committed by GitHub

Add model_cards for DynaBERT (#8012)

* Update README.md

* Add dynabert_overview.png

* Update README.md

* Create README.md

* Add dynabert_overview.png

* Update README.md

* Update README.md

* Delete dynabert_overview.png

* Update README.md

* Delete dynabert_overview.png

* Update README.md
## DynaBERT: Dynamic BERT with Adaptive Width and Depth
* DynaBERT can flexibly adjust its size and latency by selecting an adaptive width and depth, and its sub-networks perform competitively with other compressed models of similar size. DynaBERT is trained in two stages: first a width-adaptive BERT is trained, and then both adaptive width and depth are enabled using knowledge distillation (see the sketch below).
* The results in the paper are produced using a single V100 GPU.
* This code is modified from the repository developed by Hugging Face: [Transformers v2.1.1](https://github.com/huggingface/transformers/tree/v2.1.1), and is released on [GitHub](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/DynaBERT).
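
As a rough illustration of the width/depth selection idea, here is a minimal sketch using the Hugging Face Transformers API. This is not the authors' training or rewiring code (DynaBERT sorts attention heads and neurons by importance before slicing, and also scales the FFN width, which this sketch skips), and it assumes the released checkpoint loads as a standard BERT classifier; the checkpoint id `huawei-noah/DynaBERT_MNLI` and the half-width/half-depth setting are illustrative assumptions.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Assumed checkpoint id; substitute the released DynaBERT checkpoint you are using.
model_name = "huawei-noah/DynaBERT_MNLI"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)

# Adaptive depth: keep only the first k encoder layers.
depth_mult = 0.5
num_layers = len(model.bert.encoder.layer)
keep = int(num_layers * depth_mult)
model.bert.encoder.layer = model.bert.encoder.layer[:keep]
model.config.num_hidden_layers = keep

# Adaptive width: prune a subset of attention heads in each remaining layer
# (here the trailing half of the heads, purely for illustration).
num_heads = model.config.num_attention_heads
heads_to_prune = {i: list(range(num_heads // 2, num_heads)) for i in range(keep)}
model.prune_heads(heads_to_prune)

# Run the reduced sub-network on a sample input.
inputs = tokenizer("DynaBERT adapts width and depth.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
```

Slicing the leading layers and heads mirrors how a smaller sub-network shares parameters with the full model; in the paper, network rewiring places the most important heads and neurons first, so a slice of this kind keeps the best-performing components.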
### Reference
Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Qun Liu.
[DynaBERT: Dynamic BERT with Adaptive Width and Depth](https://arxiv.org/abs/2004.04037).
```
@inproceedings{hou2020dynabert,
  title     = {DynaBERT: Dynamic BERT with Adaptive Width and Depth},
  author    = {Lu Hou and Zhiqi Huang and Lifeng Shang and Xin Jiang and Qun Liu},
  booktitle = {NeurIPS},
  year      = {2020}
}
```