1. 15 Jan, 2021 1 commit
  2. 11 Jan, 2021 1 commit
    • Add widerface dataset (#2883) · d0063f3d
      Josh Bradley authored
      
      
      * initial commit of widerface dataset
      
      * comment out old code
      
      * improve parsing of annotation files
      
      * code cleanup and fix docstring comments
      
      * speed up check for quota exceeded
      
      * cleanup print statements
      
      * reformat code and remove print statements
      
      * minor code cleanup and reformatting
      
      * add more comments
      
      * reuse variable
      
      * reverse formatting changes
      
      * fix flake8 errors
      
      * add type annotations
      
      * fix mypy errors
      
      * add a base_folder to root directory
      
      * some formatting fixes
      
      * GDrive threshold does not throw 403 error
      
      * testing new download logic
      
      * cleanup logic for download and integrity check
      
      * use a better variable name
      
      * format fix
      
      * reorder list in docstring
      
      * initial widerface unit test - fails on MD5 check
      
      * use list of dictionaries to store dataset
      
      * fix docstring formatting
      
      * remove unnecessary error checking
      
      * fix type checker error
      
      * revert typo fix
      
      * rename var constants, use file context manager, verify str args
      
      * fix flake8 error
      
      * fix checking target_type argument values
      
      * create uncompressed dataset folders
      
      * cleanup unit tests for widerface
      
      * use correct os function
      
      * add more info to docstring
      
      * disable unittests for windows
      
      * fix _check_integrity logic
      
      * update docstring
      
      * remove citation
      
      * remove target_type option
      
      * fix formatting issue
      Co-authored-by: Philip Meier <github.pmeier@posteo.de>
      
      * remove comment and add more info to docstring
      
      * update type annotations
      
      * restart CI jobs
      Co-authored-by: Joshua Bradley <jgbrad3@evoforge.org>
      Co-authored-by: Philip Meier <github.pmeier@posteo.de>
      Co-authored-by: vfdev <vfdev.5@gmail.com>
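      
      A minimal usage sketch of the dataset added here, assuming the torchvision.datasets.WIDERFace interface as merged in this PR (the argument names and target keys are inferred from the docstring work above; treat the details as assumptions):
      
          from torchvision import datasets
          
          # Hypothetical paths; a "widerface" base folder is created under root.
          dataset = datasets.WIDERFace(
              root="data",
              split="train",      # one of "train", "val", "test"
              download=True,      # Google Drive download; may hit quota limits
          )
          
          # train/val samples are (PIL image, annotation dict); the test split
          # carries no annotations, so the target is None there.
          img, target = dataset[0]
          print(img.size, None if target is None else target["bbox"].shape)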
  3. 07 Jan, 2021 1 commit
  4. 22 Dec, 2020 1 commit
  5. 07 Dec, 2020 1 commit
  6. 01 Dec, 2020 1 commit
    • concatenate small tensors into big ones to reduce the use of shared file descriptor (#1795) · 9fc6522d
      Francisco Massa authored
      * concatenate small tensors into big ones to reduce the use of shared file descriptor (#1694)
      
      Summary:
      Pull Request resolved: https://github.com/pytorch/vision/pull/1694
      
      
      
      - The PT dataloader forks worker processes to speed up the fetching of dataset examples. The recommended multiprocessing context is `forkserver` rather than `fork`.
      
      - The main process and worker processes share the dataset instance, which avoids duplicating the dataset and saves memory. During this, `ForkPickler(...).dumps(...)` is called to serialize the objects, including the objects inside the dataset instance, recursively. A `VideoClips` instance internally uses O(N) `torch.Tensor`s to store per-video information, such as pts and possible clips, where N is the number of videos.
      
      - During dumping, each `torch.Tensor` uses one file descriptor (FD). The OS default FD limit is 65K (query it with `ulimit -n`), and the number of tensors in `VideoClips` often exceeds it.
      
      - To resolve this issue, we concatenate the small tensors into a few big ones in the `__getstate__()` method, which is called during pickling. This requires only O(1) tensors (see the sketch after this summary).
      
      - Once this diff lands, we can abandon D19173248.
      
      In D19173397 in ClassyVision, we change the mp context from `fork` to `forkserver` and can finally run the PT dataloader without hanging issues.
      
      Reviewed By: fmassa
      
      Differential Revision: D19179991
      
      fbshipit-source-id: c8716775c7c154aa33d93b25d112d2a59ea688a9
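      
      A minimal sketch of the concatenation trick described above, not the actual VideoClips code; the class and field names are hypothetical:
      
          import torch
          
          class ClipMetadata:
              def __init__(self, pts_per_video):
                  # pts_per_video: list of N 1-D tensors, one per video
                  self.pts_per_video = pts_per_video
          
              def __getstate__(self):
                  # Called by ForkPickler during pickling: replace the O(N)
                  # small tensors with one big tensor plus a lengths tensor,
                  # so serialization needs only O(1) file descriptors.
                  state = self.__dict__.copy()
                  state["_lengths"] = torch.as_tensor(
                      [len(t) for t in self.pts_per_video])
                  state["pts_per_video"] = torch.cat(self.pts_per_video)
                  return state
          
              def __setstate__(self, state):
                  # Invert the concatenation on unpickling.
                  lengths = state.pop("_lengths").tolist()
                  state["pts_per_video"] = list(
                      torch.split(state["pts_per_video"], lengths))
                  self.__dict__ = state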
      
      * Try to fix Windows
      
      * Try fix Windows v2
      
      * Disable tests on Windows
      
      * Add back necessary part
      
      * Try fix OSX (and maybe Windows)
      
      * Fix
      
      * Try enabling Windows
      Co-authored-by: Zhicheng Yan <zyan3@fb.com>
  7. 26 Nov, 2020 1 commit
  8. 20 Nov, 2020 1 commit
  9. 06 Nov, 2020 1 commit
  10. 23 Oct, 2020 1 commit
  11. 12 Oct, 2020 1 commit
  12. 14 Sep, 2020 1 commit
  13. 09 Sep, 2020 1 commit
  14. 27 Aug, 2020 1 commit
    • Fix Places365 dataset (#2625) · 6f028212
      Philip Meier authored
      * fix images extraction
      
      * remove test split
      
      * fix tests
      
      * be less clever in test data generation
      
      * remove micro optimization
      
      * lint
  15. 25 Aug, 2020 2 commits
    • Places365 dataset (#2610) · fc69c225
      Philip Meier authored
      * initial draft
      
      * [dirty] progress
      
      * remove inheritance from ImageFolder
      
      * add tests
      
      * lint
      
      * fix type hints
      
      * align getitem with other datasets
      
      * remove unused import
      
      * add docstring
      
      * guard existing image folders from overwrite
      
      * add missing entry in docstring
      
      * make fixpath more legible
      
      * add Places365 to docs
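      
      A short usage sketch, assuming the torchvision.datasets.Places365 interface added in this PR (split names and the small flag follow the docstring; treat the exact defaults as assumptions):
      
          from torchvision import datasets
          
          dataset = datasets.Places365(
              root="data/places365",
              split="val",    # e.g. "train-standard", "train-challenge", "val"
              small=True,     # 256x256 variant instead of the high-resolution one
              download=True,
          )
          img, label = dataset[0]   # (PIL image, class index), as in other datasets
          print(dataset.classes[label])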
    • fix FashionMNIST docstring (#2614) · 01fb0df0
      Philip Meier authored
  16. 20 Aug, 2020 1 commit
    • Only pull keys from db in lsun for faster cache. (#2544) · ea6b879e
      Harsh Rangwani authored
      * Only pull keys from db in lsun for faster cache.
      
      This pull request speeds up cache creation for the LSUN dataset. For "kitchen_train", cache creation was taking more than two hours; with this change it completes within minutes. The issue was that the loop pulled the large image values each time only to drop them.
      
      For more details, please refer to this issue: https://github.com/jnwatson/py-lmdb/issues/195. A sketch of the keys-only iteration is below.
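      
      A sketch of the keys-only iteration this change relies on, using the py-lmdb cursor API (the database path is hypothetical):
      
          import lmdb
          
          env = lmdb.open("kitchen_train_lmdb", readonly=True, lock=False,
                          readahead=False, meminit=False)
          with env.begin(write=False) as txn:
              # iternext(keys=True, values=False) yields only the keys, so the
              # large image blobs are never read while building the cache.
              keys = [key for key in txn.cursor().iternext(keys=True, values=False)]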
      
      * Fixed bug in lsun.py when loading multiple categories
      
      * Make linter happy
  17. 03 Aug, 2020 11 commits
  18. 31 Jul, 2020 9 commits
  19. 30 Jul, 2020 1 commit
  20. 03 Jul, 2020 1 commit
  21. 22 Jun, 2020 1 commit