    [test] specify chunks for pipe/transformer benchmark (#52) · d1d74413
    Jun Ru Anderson authored
    
    
    * specify chunks for pipe/transformer benchmark
    
    Set chunks equal to len(balance) for the pipe/transformer benchmark. The words-per-second and memory usage checks will be updated in the next commit; appropriate values must be found by testing on CircleCI.
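
As a rough illustration of what setting chunks to len(balance) means for micro-batch sizes, the sketch below splits a batch into as many micro-batches as there are pipeline partitions. The `balance` values, the batch size of 32, and the `micro_batch_sizes` helper are all hypothetical, and the near-equal splitting rule is an assumption made for illustration, not necessarily how the pipeline library divides a batch.

```python
def micro_batch_sizes(batch_size, chunks):
    """Split batch_size into `chunks` near-equal micro-batch sizes (illustrative rule)."""
    base, extra = divmod(batch_size, chunks)
    # The first `extra` micro-batches take one additional sample each.
    return [base + 1 if i < extra else base for i in range(chunks)]


balance = [2, 2, 2, 2]  # hypothetical number of layers per partition
chunks = len(balance)   # one micro-batch per partition, as in this commit
sizes = micro_batch_sizes(32, chunks)  # 32 is a hypothetical batch size
print(chunks, sizes)
```

With more chunks, each micro-batch gets smaller, so a small batch can leave each GPU with too little work per step.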
    
    * change benchmark words per second and memory usage
    
    Six runs gave words-per-second results of 9144.40, 9163.91, 9993.01, 9082.82, 9155.09 and 9000.67.
    Peak allocated bytes per device (which do not change between runs) were 193206272, 645632, 562688 and 92688384 for devices 0, 1, 2 and 3, respectively.
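
One possible way to turn the six words-per-second runs into a lower bound for a benchmark check is sketched below. The `wps_floor` helper and the 5% `margin` are assumptions for illustration; the source does not say how the actual check values were chosen.

```python
from statistics import mean

# Words-per-second results from the six runs listed above.
wps_runs = [9144.40, 9163.91, 9993.01, 9082.82, 9155.09, 9000.67]


def wps_floor(runs, margin=0.05):
    """Lower bound for a wps check: slowest observed run minus a safety margin.

    `margin` is a hypothetical value, not one taken from the benchmark.
    """
    return min(runs) * (1.0 - margin)


floor = wps_floor(wps_runs)
print(f"mean={mean(wps_runs):.2f} min={min(wps_runs):.2f} floor={floor:.2f}")
```

Basing the bound on the slowest run rather than the mean keeps a single fast outlier (such as the 9993.01 run) from tightening the check.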
    
    * increase batch size
    
    The batch size was small enough that the GPUs' computing power was not the bottleneck, which slowed training and, in particular, made runs with more chunks slower. Increasing the batch size has therefore increased training speed.
    
    * update benchmark numbers
    
    Ran six times, with wps 36917.44, 36797.65, 37006.03, 36872.84, 37129.31 and 37003.31, and peak allocated bytes 4061909504, 4050944, 10427392 and 2031824896 for devices 0, 1, 2 and 3, respectively.
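
A per-device memory check against the reported peaks could look roughly like the sketch below. The `within_budget` helper and its 5% `slack` are hypothetical; the byte counts are the ones reported above.

```python
# Peak allocated bytes per device, from the runs reported above.
peak_bytes = {0: 4061909504, 1: 4050944, 2: 10427392, 3: 2031824896}


def within_budget(measured, reference, slack=0.05):
    """True if every device stays within `slack` of its reference peak.

    `slack` is a hypothetical tolerance, not a value from the benchmark.
    """
    return all(measured[d] <= reference[d] * (1.0 + slack) for d in reference)


for device, nbytes in peak_bytes.items():
    print(f"device {device}: {nbytes / 2**20:.2f} MiB")

print(within_budget(peak_bytes, peak_bytes))
```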
    Co-authored-by: Jun Ru Anderson <andersonic@fb.com>
transformer.py 9.34 KB