Commit 297233fa authored by Sam Shleifer, committed by GitHub

[s2s] Switch README urls to cdn (#7670)

parent a1ecc90d
@@ -19,7 +19,7 @@ For `bertabs` instructions, see [`bertabs/README.md`](bertabs/README.md).
 #### XSUM:
 ```bash
 cd examples/seq2seq
-wget https://s3.amazonaws.com/datasets.huggingface.co/summarization/xsum.tar.gz
+wget https://cdn-datasets.huggingface.co/summarization/xsum.tar.gz
 tar -xzvf xsum.tar.gz
 export XSUM_DIR=${PWD}/xsum
 ```
@@ -29,7 +29,7 @@ To use your own data, copy that files format. Each article to be summarized is o
 #### CNN/DailyMail
 ```bash
 cd examples/seq2seq
-wget https://s3.amazonaws.com/datasets.huggingface.co/summarization/cnn_dm_v2.tgz
+wget https://cdn-datasets.huggingface.co/summarization/cnn_dm_v2.tgz
 tar -xzvf cnn_dm_v2.tgz # empty lines removed
 mv cnn_cln cnn_dm
 export CNN_DIR=${PWD}/cnn_dm
@@ -39,7 +39,7 @@ this should make a directory called `cnn_dm/` with 6 files.
 #### WMT16 English-Romanian Translation Data:
 download with this command:
 ```bash
-wget https://s3.amazonaws.com/datasets.huggingface.co/translation/wmt_en_ro.tar.gz
+wget https://cdn-datasets.huggingface.co/translation/wmt_en_ro.tar.gz
 tar -xzvf wmt_en_ro.tar.gz
 export ENRO_DIR=${PWD}/wmt_en_ro
 ```
@@ -47,7 +47,7 @@ this should make a directory called `wmt_en_ro/` with 6 files.
 #### WMT English-German:
 ```bash
-wget https://s3.amazonaws.com/datasets.huggingface.co/translation/wmt_en_de.tgz
+wget https://cdn-datasets.huggingface.co/translation/wmt_en_de.tgz
 tar -xzvf wmt_en_de.tgz
 export DATA_DIR=${PWD}/wmt_en_de
 ```
......
 # Script for verifying that run_bart_sum can be invoked from its directory
 # Get tiny dataset with cnn_dm format (4 examples for train, val, test)
-wget https://s3.amazonaws.com/datasets.huggingface.co/summarization/cnn_tiny.tgz
+wget https://cdn-datasets.huggingface.co/summarization/cnn_tiny.tgz
 tar -xzvf cnn_tiny.tgz
 rm cnn_tiny.tgz
......
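
Every hunk in this commit makes the same substitution: the host prefix `https://s3.amazonaws.com/datasets.huggingface.co/` becomes `https://cdn-datasets.huggingface.co/`, and the path after it is unchanged. For local scripts that still carry the old URLs, the mapping could be sketched roughly as below. `to_cdn_url` is a hypothetical helper, not something in the repository, and it assumes the prefix swap is the only difference between old and new URLs, which is all the diff above shows.

```bash
# Hypothetical helper (not part of the repository): rewrite an old S3 dataset
# URL to the CDN form used after this commit. Assumes only the host prefix
# changes and the path below it stays the same.
OLD_PREFIX="https://s3.amazonaws.com/datasets.huggingface.co/"
NEW_PREFIX="https://cdn-datasets.huggingface.co/"

to_cdn_url() {
    case "$1" in
        # URL starts with the old prefix: strip it and prepend the new one.
        "$OLD_PREFIX"*) printf '%s%s\n' "$NEW_PREFIX" "${1#"$OLD_PREFIX"}" ;;
        # Anything else is passed through untouched.
        *)              printf '%s\n' "$1" ;;
    esac
}

to_cdn_url "https://s3.amazonaws.com/datasets.huggingface.co/summarization/xsum.tar.gz"
# prints https://cdn-datasets.huggingface.co/summarization/xsum.tar.gz
```

The output can be passed straight to `wget`, for example `wget "$(to_cdn_url "$OLD_URL")"`, where `OLD_URL` is whatever variable holds the legacy address.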