1. 22 Dec, 2020 1 commit
  2. 07 Dec, 2020 1 commit
  3. 01 Dec, 2020 1 commit
    •
      concatenate small tensors into big ones to reduce the use of shared file descriptor (#1795) · 9fc6522d
      Francisco Massa authored
      * concatenate small tensors into big ones to reduce the use of shared file descriptor (#1694)
      
      Summary:
      Pull Request resolved: https://github.com/pytorch/vision/pull/1694
      
      
      
      - The PT dataloader forks worker processes to speed up the fetching of dataset examples. The recommended multiprocessing context is `forkserver` rather than `fork`.
      
      - The main process and the worker processes share the dataset class instance, which avoids duplicating the dataset and saves memory. During this, `ForkPickler(...).dumps(...)` is called to serialize the objects, recursively including the objects inside the dataset instance. A `VideoClips` instance internally uses O(N) `torch.Tensor`s to store per-video information, such as pts and possible clips, where N is the number of videos.
      
      - During dumping, each `torch.Tensor` uses one file descriptor (FD). The OS default FD limit, which can be queried with `ulimit -n`, is 65K. The number of tensors in `VideoClips` often exceeds this limit.
      
      - To resolve this issue, we concatenate the small tensors into a few big ones in the `__getstate__()` method, which is called during pickling. This requires only O(1) tensors; see the sketch after this list.
      
      - Once this diff lands, we can abandon D19173248.
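
      A minimal sketch of the concatenation idea, using a simplified stand-in class (the `pts` attribute and class name are illustrative, not the actual `VideoClips` layout):

      ```python
      import torch

      class VideoClipsLike:
          """Illustrative stand-in: stores one small tensor per video."""

          def __init__(self, pts_per_video):
              # O(N) small tensors; each would cost one file descriptor
              # when serialized with ForkPickler.
              self.pts = [torch.as_tensor(p) for p in pts_per_video]

          def __getstate__(self):
              state = self.__dict__.copy()
              # Pack the small tensors into one big tensor plus per-video
              # lengths, so pickling transfers only O(1) tensors.
              lengths = [t.numel() for t in self.pts]
              state["pts"] = (torch.cat(self.pts), lengths)
              return state

          def __setstate__(self, state):
              flat, lengths = state["pts"]
              # Restore the original per-video tensors on unpickling.
              state["pts"] = list(torch.split(flat, lengths))
              self.__dict__ = state
      ```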
      
      In D19173397, in ClassyVision, we change the mp context from `fork` to `forkserver`, and can finally run the PT dataloader without hanging issues.
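
      A minimal sketch of selecting that context on the PT dataloader (the `TensorDataset` here is just a placeholder for any map-style dataset):

      ```python
      import torch
      from torch.utils.data import DataLoader, TensorDataset

      # Placeholder dataset; any map-style dataset works the same way.
      dataset = TensorDataset(torch.arange(100, dtype=torch.float32))

      # Spawn workers via forkserver instead of fork, as recommended above.
      loader = DataLoader(dataset, batch_size=8, num_workers=4,
                          multiprocessing_context="forkserver")
      ```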
      
      Reviewed By: fmassa
      
      Differential Revision: D19179991
      
      fbshipit-source-id: c8716775c7c154aa33d93b25d112d2a59ea688a9
      
      * Try to fix Windows
      
      * Try fix Windows v2
      
      * Disable tests on Windows
      
      * Add back necessary part
      
      * Try fix OSX (and maybe Windows)
      
      * Fix
      
      * Try enabling Windows

      Co-authored-by: Zhicheng Yan <zyan3@fb.com>
  4. 26 Nov, 2020 1 commit
  5. 20 Nov, 2020 1 commit
  6. 06 Nov, 2020 1 commit
  7. 23 Oct, 2020 1 commit
  8. 12 Oct, 2020 1 commit
  9. 14 Sep, 2020 1 commit
  10. 09 Sep, 2020 1 commit
  11. 27 Aug, 2020 1 commit
    •
      Fix Places365 dataset (#2625) · 6f028212
      Philip Meier authored
      * fix images extraction
      
      * remove test split
      
      * fix tests
      
      * be less clever in test data generation
      
      * remove micro optimization
      
      * lint
  12. 25 Aug, 2020 2 commits
    •
      Places365 dataset (#2610) · fc69c225
      Philip Meier authored
      * initial draft
      
      * [dirty] progress
      
      * remove inheritance from ImageFolder
      
      * add tests
      
      * lint
      
      * fix type hints
      
      * align getitem with other datasets
      
      * remove unused import
      
      * add docstring
      
      * guard existing image folders from overwrite
      
      * add missing entry in docstring
      
      * make fixpath more legible
      
      * add Places365 to docs
    •
      fix FashionMNIST docstring (#2614) · 01fb0df0
      Philip Meier authored
  13. 20 Aug, 2020 1 commit
    •
      Only pull keys from db in lsun for faster cache. (#2544) · ea6b879e
      Harsh Rangwani authored
      * Only pull keys from db in lsun for faster cache.
      
      This pull request speeds up cache creation for the LSUN dataset. For "kitchen_train", cache creation was taking more than two hours; it now completes within minutes. The issue was pulling the large image values from the database each time only to drop them.
      
      For more details, please refer to this issue: https://github.com/jnwatson/py-lmdb/issues/195. A sketch of the keys-only approach follows.
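
      A hedged sketch of the keys-only iteration (`db_path` is a hypothetical path to one LSUN category database; the actual integration in `lsun.py` differs):

      ```python
      import lmdb

      # Hypothetical path to one LSUN category database.
      db_path = "data/lsun/kitchen_train_lmdb"

      env = lmdb.open(db_path, readonly=True, lock=False)
      with env.begin(write=False) as txn:
          # values=False iterates keys only, avoiding reading each large
          # image blob just to discard it, which made caching so slow.
          keys = [key for key in txn.cursor().iternext(keys=True, values=False)]
      ```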
      
      * Fixed bug in lsun.py when loading multiple categories
      
      * Make linter happy
  14. 03 Aug, 2020 11 commits
  15. 31 Jul, 2020 9 commits
  16. 30 Jul, 2020 1 commit
  17. 03 Jul, 2020 1 commit
  18. 22 Jun, 2020 1 commit
  19. 18 May, 2020 1 commit
  20. 07 May, 2020 1 commit
    •
      Update ucf101.py (#2186) · 14af9de6
      Guillem Orellana Trullols authored
      Currently the dataset does not work properly because of this line of code: `indices = [i for i in range(len(video_list)) if video_list[i][len(self.root) + 1:] in selected_files]`.
      Slicing with `len(self.root) + 1` only makes sense if there is no trailing slash in the root:
      
      ```
      >>> root = 'data/ucf-101/videos'
      >>> video_path = 'data/ucf-101/videos/activity/video.avi'
      >>> video_path[len(root):]
      '/activity/video.avi'
      >>> video_path[len(root) + 1:]
      'activity/video.avi'
      ```
      
      Prefixing the selected files with the root path as well is a simple solution and makes the dataset work both with and without a trailing slash; a sketch of the fix follows.
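
      A minimal sketch of that fix (standalone, with illustrative values; inside the dataset class, `root` would be `self.root`):

      ```python
      import os

      root = 'data/ucf-101/videos'  # works with or without a trailing slash
      video_list = ['data/ucf-101/videos/activity/video.avi']
      selected_files = {'activity/video.avi'}  # relative paths from the annotation file

      # Prefix each selected file with the root, so the membership test no
      # longer depends on slicing off len(root) + 1 characters.
      selected_files = {os.path.join(root, f) for f in selected_files}
      indices = [i for i, path in enumerate(video_list) if path in selected_files]
      ```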
  21. 04 May, 2020 1 commit