The training process of DynaBERT includes first training a width-adaptive BERT and then allowing both adaptive width and depth using knowledge distillation.
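For intuition, below is a minimal, hypothetical PyTorch sketch of what inference-time depth adaptivity in a DynaBERT-style encoder could look like; it is not the repository's actual API. The `AdaptiveEncoder` class and the `depth_mult` attribute are illustrative names, and the real method additionally scales width (attention heads and feed-forward neurons) and trains every sub-network with knowledge distillation from the full-size model.

```python
import torch
import torch.nn as nn

class AdaptiveEncoder(nn.Module):
    """Toy transformer encoder that can be run at a reduced depth.

    Illustrative sketch only: DynaBERT also adapts width (attention heads
    and feed-forward neurons) and distills knowledge from the full-size
    teacher into each sub-network during training.
    """

    def __init__(self, hidden=256, heads=8, num_layers=6, ffn=1024):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(hidden, heads, ffn, batch_first=True)
            for _ in range(num_layers)
        )
        self.depth_mult = 1.0  # fraction of layers to execute at inference

    def forward(self, x):
        # Depth adaptivity: execute only the first round(depth_mult * L)
        # layers and skip the rest.
        n = max(1, round(self.depth_mult * len(self.blocks)))
        for block in self.blocks[:n]:
            x = block(x)
        return x

model = AdaptiveEncoder()
x = torch.randn(2, 16, 256)   # (batch, sequence, hidden)
model.depth_mult = 0.5        # run half of the layers
out = model(x)                # output keeps shape (2, 16, 256)
```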
* This code is modified from the repository developed by Hugging Face: [Transformers v2.1.1](https://github.com/huggingface/transformers/tree/v2.1.1), and is released on [GitHub](https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/DynaBERT).
### Reference
Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Qun Liu.
[DynaBERT: Dynamic BERT with Adaptive Width and Depth](https://arxiv.org/abs/2004.04037).