Commit 564c62c6 authored by kimiyoung

add resources and training time

parent 85cc80ab
@@ -62,6 +62,11 @@ pretrained_xl
- run the script: `bash sota/text8.sh`
#### 3. Resources Needed for SoTA Model Training
We used 32, 32, 64, and 512 TPU cores to train our best models on enwik8, text8, wt103, and lm1b, respectively. Training each model takes 2 to 5 days.
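For a rough sense of scale, the figures above can be converted into TPU core-days. The snippet below is purely illustrative (the `CORES` table and `core_day_range` helper are hypothetical names, not part of this repo, and applying the overall 2-to-5-day window to every model is an assumption, since the text only gives the range across models):

```python
# Hypothetical helper: estimate TPU core-days per dataset from the
# core counts and the 2-5 day training window quoted in the README.
CORES = {"enwik8": 32, "text8": 32, "wt103": 64, "lm1b": 512}
MIN_DAYS, MAX_DAYS = 2, 5  # assumption: same window applies to each model

def core_day_range(dataset):
    """Return (min, max) TPU core-days for the given dataset."""
    cores = CORES[dataset]
    return cores * MIN_DAYS, cores * MAX_DAYS

for name in CORES:
    lo, hi = core_day_range(name)
    print(f"{name}: {lo}-{hi} TPU core-days")
```

The lm1b model dominates the budget: at 512 cores it costs an order of magnitude more core-days than the character-level models, even over the same wall-clock window.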
## Train "Transformer-XL" from scratch with GPUs or TPUs