"...git@developer.sourcefind.cn:chenpangpang/open-webui.git" did not exist on "4195af4942828c476084dfccf498a287713805b5"
Commit 0d81fc85 authored by Rémi Louf

specify in readme that both datasets are required

parent 19e99647
@@ -395,13 +395,17 @@ This fine-tuned model is available as a checkpoint under the reference
Based on the script [`run_seq2seq_finetuning.py`](https://github.com/huggingface/transformers/blob/master/examples/run_seq2seq_finetuning.py).
Before running this script you should download **both** CNN and Daily Mail
datasets from [Kyunghyun Cho's website](https://cs.nyu.edu/~kcho/DMQA/) (the
links next to "Stories") into the same folder. Then uncompress the archives by running:
```bash
tar -xvf cnn_stories.tgz && tar -xvf dailymail_stories.tgz
```
Note that the fine-tuning script **will not work** if you do not download both
datasets. We will refer to the path where you uncompressed both archives as
`$DATA_PATH`.
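Since the script needs both datasets, it can help to check the folder before launching a run. The following is a minimal sketch, not part of the repository: the helper name `check_datasets` is hypothetical, and the directory names `cnn` and `dailymail` are an assumption about what the two archives unpack to, so adjust them if your folder looks different.

```bash
#!/usr/bin/env bash
# Sketch: verify that both uncompressed datasets are present.
# ASSUMPTION: the archives unpack into "cnn" and "dailymail" directories;
# change the names below if your layout differs.

check_datasets() {
  local data_path="$1"
  local name
  local missing=0
  for name in cnn dailymail; do
    if [ ! -d "${data_path}/${name}" ]; then
      echo "missing dataset directory: ${data_path}/${name}" >&2
      missing=1
    fi
  done
  # Returns 0 only when both directories exist.
  return "${missing}"
}
```

For example, `check_datasets "$DATA_PATH" && echo "both datasets found"` prints the message only when both directories exist under `$DATA_PATH`.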
## Bert2Bert and abstractive summarization