    Improve lfilter speed (#564) · 27a0f765
    moto authored
    Before
    
    Total time: 13.7078 s
    
    ```
    Line #|      Hits|         Time| Time per hit|      %|Source code
       722|    220501|      2.44247|  1.10769e-05| 17.82%|    for i_sample, o0 in enumerate(input_signal_windows.t()):
    (call)|         1|  6.36578e-05|  6.36578e-05|  0.00%|# /scratch/moto/pytorch/torch/tensor.py:460 __iter__
    (call)|    220500|      1.60566|  7.28191e-06| 11.71%|# /scratch/moto/pytorch/torch/tensor.py:474 <lambda>
       723|    220500|      1.86697|  8.46698e-06| 13.62%|        windowed_output_signal = padded_output_waveform[:, i_sample:(i_sample + n_order)]
       724|    220500|      1.94628|  8.82669e-06| 14.20%|        o0.addmv_(windowed_output_signal, a_coeffs_flipped, alpha=-1)
       725|    220500|         2.46|  1.11565e-05| 17.94%|        o0.div_(a_coeffs[0])
       726|         0|            0|            0|  0.00%|
       727|    220500|      3.37869|  1.53229e-05| 24.64%|        padded_output_waveform[:, i_sample + n_order - 1] = o0
    ```
    
    After
    
    Total time: 10.9667 s
    
    ```
    Line #|      Hits|         Time| Time per hit|      %|Source code
       722|         1|   9.2268e-05|   9.2268e-05|  0.00%|    input_signal_windows.div_(a_coeffs[0])
       723|         1|  2.14577e-05|  2.14577e-05|  0.00%|    a_coeffs_flipped.div_(a_coeffs[0])
       724|    220501|      2.40216|  1.08941e-05| 21.90%|    for i_sample, o0 in enumerate(input_signal_windows.t()):
    (call)|         1|  5.84126e-05|  5.84126e-05|  0.00%|# /scratch/moto/pytorch/torch/tensor.py:460 __iter__
    (call)|    220500|      1.59821|   7.2481e-06| 14.57%|# /scratch/moto/pytorch/torch/tensor.py:474 <lambda>
       725|    220500|      1.82273|  8.26633e-06| 16.62%|        windowed_output_signal = padded_output_waveform[:, i_sample:(i_sample + n_order)]
       726|    220500|      1.84074|  8.34802e-06| 16.78%|        o0.addmv_(windowed_output_signal, a_coeffs_flipped, alpha=-1)
       727|    220500|       3.2952|  1.49442e-05| 30.05%|        padded_output_waveform[:, i_sample + n_order - 1] = o0
    ```
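
The profiles above suggest the saving comes from dropping the per-sample `o0.div_(a_coeffs[0])` and instead dividing `input_signal_windows` and `a_coeffs_flipped` by `a_coeffs[0]` once, before the loop. The sketch below illustrates why the two orderings give the same result. It is not the torchaudio implementation: it uses plain Python lists instead of tensors so it runs without dependencies, the function names `lfilter_per_sample_div` and `lfilter_prenormalized` are hypothetical, and it assumes `a` and `b` have the same length.

```python
def lfilter_per_sample_div(x, a, b):
    """IIR filter that divides every output sample by a[0] inside the loop."""
    n = len(a)  # assumption: len(a) == len(b)
    y = [0.0] * len(x)
    for i in range(len(x)):
        # Feed-forward part: weighted sum of current and past inputs.
        acc = sum(b[k] * x[i - k] for k in range(n) if i - k >= 0)
        # Feedback part: subtract weighted past outputs.
        acc -= sum(a[k] * y[i - k] for k in range(1, n) if i - k >= 0)
        y[i] = acc / a[0]  # one division per sample
    return y


def lfilter_prenormalized(x, a, b):
    """Same filter, with the division by a[0] hoisted out of the loop."""
    n = len(a)  # assumption: len(a) == len(b)
    a0 = a[0]
    b_norm = [c / a0 for c in b]  # done once, like input_signal_windows.div_
    a_norm = [c / a0 for c in a]  # done once, like a_coeffs_flipped.div_
    y = [0.0] * len(x)
    for i in range(len(x)):
        acc = sum(b_norm[k] * x[i - k] for k in range(n) if i - k >= 0)
        acc -= sum(a_norm[k] * y[i - k] for k in range(1, n) if i - k >= 0)
        y[i] = acc  # no division needed here
    return y


if __name__ == "__main__":
    x = [1.0, 0.5, -0.25, 0.0, 2.0]
    a = [2.0, -0.5, 0.25]
    b = [0.5, 0.3, 0.2]
    print(lfilter_per_sample_div(x, a, b))
    print(lfilter_prenormalized(x, a, b))
```

Both functions compute the same recurrence; the second trades one division per sample for two up-front divisions over the coefficient lists, so results may differ only by floating-point rounding.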
functional.py 60.3 KB