"docker/requirements.txt" did not exist on "71b6b09e269af1ed3afdf119381ebf5214557e3f"
prepare_datasets.sh 330 Bytes
v1.0, committed by chenzk
python prepare_datasets.py --index_file data/train_index.txt --input_data_dir data --data_split train --output_data_dir data --tiktoken_tokenizer_name "cl100k_base"
# python prepare_datasets.py --index_file data/test_index.txt --input_data_dir data --data_split test --output_data_dir data --tiktoken_tokenizer_name "cl100k_base"
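
The two commands above differ only in the split name, so they could be driven from one loop. The following is a minimal sketch, not part of the original script: the `run` and `prepare_split` helpers and the `DRY_RUN` variable are hypothetical names introduced here, and it assumes the test index lives at data/test_index.txt as in the commented-out line. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: prepare one split using the same flags as the
# original command; only the split name varies.
prepare_split() {
    split="$1"
    run python prepare_datasets.py \
        --index_file "data/${split}_index.txt" \
        --input_data_dir data \
        --data_split "$split" \
        --output_data_dir data \
        --tiktoken_tokenizer_name "cl100k_base"
}

for split in train test; do
    prepare_split "$split"
done
```

Printing by default makes it easy to check the generated paths before a long tokenization run starts.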