    add rwkv-5-3b model (#666) · 82a533a6
    Xiaoyu Zhang authored
    * support rwkv5-3b leaderboard
    
    * update rwkv-5-3b config
    
    * update config
    
    * refine
    
    * fix bug
    
    * update config
    
    * refine
    
    * reduce batch size
    
    * refine
    
    * reduce batch size to avoid oom in special datasets
    
    * Update huggingface.py
    
    * Update huggingface.py
eval_rwkv5_3b.py 219 Bytes