Commit 0371621a authored by chenzk

v1.0

Pipeline #1989 canceled with stages
python prepare_datasets.py --index_file data/train_index.txt --input_data_dir data --data_split train --output_data_dir data --tiktoken_tokenizer_name "cl100k_base"
# python prepare_datasets.py --index_file data/test_index.txt --input_data_dir data --data_split test --output_data_dir data --tiktoken_tokenizer_name "cl100k_base"
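The internals of `prepare_datasets.py` are not shown in this commit. A minimal sketch of what such a preparation step commonly does — packing tokenizer output into a flat binary file for fast training-time loading; the function names `write_tokens`/`read_tokens` are hypothetical, not the script's actual API, and the token ids are assumed to come from a tokenizer such as tiktoken's `cl100k_base`:

```python
import numpy as np

def write_tokens(token_ids, path):
    # cl100k_base has ~100k distinct tokens, so uint16 would overflow;
    # uint32 safely holds every id.
    arr = np.array(token_ids, dtype=np.uint32)
    arr.tofile(path)  # raw little-endian dump, no header

def read_tokens(path):
    # read the flat binary back into an id array
    return np.fromfile(path, dtype=np.uint32)
```

A flat binary like this can later be memory-mapped during training, so batches are sliced without loading the whole dataset into RAM.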
from setuptools import setup

setup(
    name='allamo',
    version='5.0.0',
    author='Krzysztof (Chris) Ociepa',
    packages=['allamo'],
    description='Simple, hackable and fast implementation for training/finetuning medium-sized LLaMA-based models',
    license='MIT',
    install_requires=[
        'torch',
        'numpy',
        'joblib',
        'wandb',
    ],
)
from allamo.configuration import AllamoConfiguration
from allamo.trainer.simple_trainer import SimpleTrainer

if __name__ == '__main__':
    config = AllamoConfiguration()
    trainer = SimpleTrainer(config)
    trainer.init_wandb()
    trainer.train()
    trainer.close()
# Refer to allamo/configuration.py. For SFT (supervised fine-tuning), set "training_type": "sft", "init_from": "resume", ...
python train.py --config="./train_configs/train_1B.json"
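The referenced `train_configs/train_1B.json` file is not included in this excerpt. A hedged sketch of the SFT-resume override mentioned in the comment above — only the two fields named there are taken from the source; any other settings the config needs are not shown here:

```json
{
  "training_type": "sft",
  "init_from": "resume"
}
```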
torchrun --standalone --nnodes=1 --nproc-per-node=8 train.py --config="./train_configs/train_1B.json"