    [Research Projects] ORPO diffusion for alignment (#7423) · e29f16cf
    Sayak Paul authored
    
    
    * barebones orpo
    
    * remove reference model.
    
    * full implementation
    
    * change default of beta_orpo
    
    * add a training command.
    
    * fix: dataloading issues.
    
    * interpreting the formulation.
    
    * revert styling
    
    * add: wds full blown version
    
    * fix: per_gpu_batch_siz
    
    * start debugging
    
    * debugging
    
    * remove print
    
    * fix
    
    * remove filter keys.
    
    * turn on non-blocking calls.
    
    * device_placement
    
    * let's see.
    
    * add bigger training run command
    
    * reinitialize generator for fair repro
    
    * add: detailed readme and requirements
    
    ---------
    Co-authored-by: Sayak Paul <sayakpaul@Sayaks-MacBook-Pro-2.local>
README.md 4.23 KB