    Reorder NCF data pipeline (#5536) · 19d4eaaf
    Taylor Robie authored
    * intermediate commit
    
    finish replacing spillover with resampled padding
    
    intermediate commit
    
    * resolve merge conflict
    
    * intermediate commit
    
    * further consolidate the data pipeline
    
    * complete first pass at data pipeline refactor
    
    * remove some leftover code
    
    * fix test
    
    * remove resampling, and move train padding logic into neumf.py
    
    * small tweaks
    
    * fix weight bug
    
    * address PR comments
    
    * fix dict zip. (Reed led me astray)
    
    * delint
    
    * make data test deterministic and delint
    
    * Reed didn't lead me astray. I just can't read.
    
    * more delinting
    
    * even more delinting
    
    * use resampling for last batch padding
    
    * pad last batch with unique data
    
    * Revert "pad last batch with unique data"
    
    This reverts commit cbdf46efcd5c7907038a24105b88d38e7f1d6da2.
    
    * move padded batch to the beginning
    
    * delint
    
    * fix step check for synthetic data
data_preprocessing.py 26.5 KB