@@ -14,7 +14,6 @@ The main variables in the code should be set as follows:
| `--output_path` | The path where the preprocessed dataset is stored. One .idx file and one .bin file are created for each dataset. |
## Dataset
Samples in the dataset should be separated by '\n'. Within each sample, every '\n' should be replaced with '\<n>', so that each line in the dataset is a single sample. The program replaces '\<n>' back with '\n' during preprocessing.
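
The line format described above can be sketched in a few lines of Python. This is only an illustration, not the preprocessing script itself: the names `NEWLINE_TOKEN`, `encode_sample`, `decode_line`, and `encode_dataset` are hypothetical, and the sketch assumes that '\<n>' in the text is the Markdown-escaped literal token `<n>`.

```python
# Hypothetical helpers illustrating the one-sample-per-line format.
# Assumption: the placeholder is the literal string "<n>".
NEWLINE_TOKEN = "<n>"


def encode_sample(sample: str) -> str:
    """Replace newlines inside a sample so the sample fits on one line."""
    return sample.replace("\r\n", "\n").replace("\n", NEWLINE_TOKEN)


def decode_line(line: str) -> str:
    """Restore the original newlines of a sample read from one dataset line."""
    return line.rstrip("\n").replace(NEWLINE_TOKEN, "\n")


def encode_dataset(samples: list[str]) -> str:
    """Join encoded samples with '\n' so each line holds exactly one sample."""
    return "".join(encode_sample(s) + "\n" for s in samples)


if __name__ == "__main__":
    samples = ["first line\nsecond line", "a single-line sample"]
    text = encode_dataset(samples)
    print(text, end="")
    # Decoding each line gives back the original samples.
    print([decode_line(line) for line in text.splitlines()] == samples)
```

A sample that already contains the literal text `<n>` would be turned into a newline when decoded, so such data would need separate handling before using this scheme.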