Unverified Commit fa6113f9 authored by Soham Chatterjee's avatar Soham Chatterjee Committed by GitHub

Fixed spelling of training (#4416)

parent 757baee8
@@ -6,7 +6,7 @@ Overview
 The ALBERT model was proposed in `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations <https://arxiv.org/abs/1909.11942>`_
 by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut. It presents
-two parameter-reduction techniques to lower memory consumption and increase the trainig speed of BERT:
+two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT:
 - Splitting the embedding matrix into two smaller matrices
 - Using repeating layers split among groups
...
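The first technique mentioned in the patched paragraph (splitting the embedding matrix into two smaller matrices) can be illustrated with a quick parameter count. This is a sketch only; the vocabulary, hidden, and embedding sizes below are BERT-base-like values assumed for illustration, not numbers taken from this commit.

```python
# Factorized embedding parameterization, as described in the ALBERT paper.
# Assumed sizes (illustrative): V = vocab size, H = hidden size, E = embedding size.
V, H, E = 30000, 768, 128

# BERT ties the embedding dimension to the hidden dimension: one V x H matrix.
bert_embedding_params = V * H

# ALBERT factorizes it into a V x E matrix followed by an E x H projection,
# with E much smaller than H, so the parameter count drops sharply.
albert_embedding_params = V * E + E * H

print(bert_embedding_params)    # 23040000
print(albert_embedding_params)  # 3938304
```

With these assumed sizes, the factorization cuts the embedding parameters by roughly a factor of six, which is the memory-reduction effect the paragraph refers to.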