    concatenate small tensors into big ones to reduce the use of shared f… (#1795) · 9fc6522d
    Francisco Massa authored
    * concatenate small tensors into big ones to reduce the use of shared file descriptor (#1694)
    
    Summary:
    Pull Request resolved: https://github.com/pytorch/vision/pull/1694
    
    
    
    - The PT dataloader forks worker processes to speed up the fetching of dataset examples. The recommended multiprocessing context is `forkserver` rather than `fork`.
    
    - The main process and worker processes share the dataset class instance, which avoids duplicating the dataset and saves memory. During this step, `ForkPickler(..).dumps(...)` is called to serialize the objects, recursively including the objects inside the dataset instance. A `VideoClips` instance internally uses O(N) `torch.Tensor` objects to store per-video information, such as pts and possible clips, where N is the number of videos.
    
    - During dumping, each `torch.Tensor` uses one file descriptor (FD). The OS default FD limit, which can be queried with `ulimit -n`, is 65K. The number of tensors in `VideoClips` often exceeds that limit.
    
    - To resolve this issue, we concatenate the small tensors into a few big ones in the `__getstate__()` method, which is called during pickling. This requires only O(1) tensors.
    
    - Once this diff lands, we can abandon D19173248
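The FD limit mentioned above can be checked from the shell; this is a generic query, not part of the change itself:

```shell
# Soft limit on open file descriptors for the current process
ulimit -n
# Hard limit (the ceiling the soft limit can be raised to)
ulimit -Hn
```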
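The `__getstate__()` trick above can be sketched as follows. This is a minimal illustration, not the actual `VideoClips` code: the class name `ClipIndex` and the attributes `video_pts` / `_pts_lengths` are hypothetical, but the pattern (concatenate O(N) small tensors into one tensor plus split lengths when pickling, and split them back when unpickling) is the one the diff describes.

```python
import pickle

import torch


class ClipIndex:
    """Hypothetical stand-in for VideoClips: holds one small tensor per video."""

    def __init__(self, pts_per_video):
        # O(N) small tensors, one per video -- each would cost one FD when pickled
        self.video_pts = [torch.as_tensor(p) for p in pts_per_video]

    def __getstate__(self):
        state = self.__dict__.copy()
        # Replace the O(N) tensors with a single concatenated tensor,
        # remembering the lengths needed to split it back later.
        state["_pts_lengths"] = [t.numel() for t in state["video_pts"]]
        state["video_pts"] = torch.cat(state["video_pts"])
        return state

    def __setstate__(self, state):
        lengths = state.pop("_pts_lengths")
        # Restore the original per-video list of small tensors.
        state["video_pts"] = list(torch.split(state["video_pts"], lengths))
        self.__dict__ = state


# Round-trip through pickle: only O(1) tensors are serialized.
idx = pickle.loads(pickle.dumps(ClipIndex([[0, 1, 2], [3, 4], [5]])))
```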
    
    In D19173397, in ClassyVision, we change the mp context from `fork` to `forkserver` and can finally run the PT dataloader without hanging issues.
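The context switch described above can be sketched with PyTorch's public `DataLoader` API; `ToyDataset` is a hypothetical stand-in for a real dataset, but `multiprocessing_context` is a genuine `DataLoader` parameter:

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in dataset for illustration."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return torch.tensor(idx)


# Start workers with the "forkserver" context instead of the
# platform default ("fork" on Linux), avoiding the hanging issues.
loader = DataLoader(
    ToyDataset(),
    batch_size=4,
    num_workers=2,
    multiprocessing_context="forkserver",
)
```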
    
    Reviewed By: fmassa
    
    Differential Revision: D19179991
    
    fbshipit-source-id: c8716775c7c154aa33d93b25d112d2a59ea688a9
    
    * Try to fix Windows
    
    * Try fix Windows v2
    
    * Disable tests on Windows
    
    * Add back necessary part
    
    * Try fix OSX (and maybe Windows)
    
    * Fix
    
    * Try enabling Windows
    Co-authored-by: Zhicheng Yan <zyan3@fb.com>
video_utils.py 15.4 KB