1. 03 Jun, 2020 1 commit
    • torchvision QAT tutorial: update for QAT with DDP (#2280) · 39021408
      Vasiliy Kuznetsov authored
      Summary:
      
      We've made two recent changes to QAT in PyTorch core:
      1. added support for SyncBatchNorm
      2. made the eager-mode QAT prepare scripts respect device affinity
      
      This PR updates the torchvision QAT reference script to take
      advantage of both of these (a sketch of the resulting flow follows).
      It should land after https://github.com/pytorch/pytorch/pull/39337
      (the last PyTorch fix) to avoid compatibility issues.
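      
      A minimal sketch of the resulting prepare -> SyncBatchNorm -> DDP ordering, assuming
      the structure of the torchvision quantization reference script; `args` and `device`
      here are placeholders, not the script's exact names:
      
      ```
      import torch
      import torchvision
      
      model = torchvision.models.quantization.mobilenet_v2(pretrained=True, quantize=False)
      model.fuse_model()
      model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
      torch.quantization.prepare_qat(model, inplace=True)  # observers now respect device affinity
      
      if args.sync_bn:
          # relies on the new SyncBatchNorm support for QAT-prepared models
          model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
      
      model.to(device)
      model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.gpu])
      ```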
      
      Test Plan:
      
      ```
      python -m torch.distributed.launch \
        --nproc_per_node 8 \
        --use_env \
        references/classification/train_quantization.py \
        --data-path {imagenet1k_subset} \
        --output-dir {tmp} \
        --sync-bn
      ```
      
  2. 18 May, 2020 1 commit
    • vision classification QAT tutorial: fix for DDP (redo) (#2230) · 7ed3950e
      Vasiliy Kuznetsov authored
      Summary:
      
      Redo of https://github.com/pytorch/vision/pull/2191
      
      Makes the classification QAT tutorial stop crashing when used
      with DDP. There were two issues:
      
      1. The model was moved to the GPU before the observers were added, but observers
      are created on the CPU. In the context of this repo, the fix is to finalize
      the model before moving it to the GPU. We can potentially follow up with a
      better error message in a separate PR.
      2. QAT conversion was running on the DDP-wrapped model, which caused various
      problems. The fix is to unwrap the model from DDP before cloning it for
      evaluation (see the sketch below).
      
      There is still work to do to verify that BN behaves correctly with
      QAT + DDP; that is left for a separate PR.
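      
      A minimal sketch of fix (2), assuming the reference script clones the QAT-prepared
      model for evaluation; the variable names are placeholders:
      
      ```
      import copy
      import torch
      
      # convert the underlying module, not the DDP wrapper
      model_without_ddp = model.module if isinstance(
          model, torch.nn.parallel.DistributedDataParallel) else model
      
      quantized_eval_model = copy.deepcopy(model_without_ddp)
      quantized_eval_model.eval()
      quantized_eval_model.to(torch.device('cpu'))
      torch.quantization.convert(quantized_eval_model, inplace=True)
      ```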
      
      Test Plan:
      
      ```
      python -m torch.distributed.launch --use_env \
        references/classification/train_quantization.py \
        --data-path {path_to_imagenet_1k} \
        --output-dir {output_dir}
      ```
      
  3. 31 Mar, 2020 1 commit
  4. 20 Mar, 2020 1 commit
  5. 13 Mar, 2020 1 commit
  6. 10 Mar, 2020 1 commit
  7. 04 Nov, 2019 1 commit
  8. 30 Oct, 2019 1 commit
  9. 26 Oct, 2019 2 commits
    • Quantizable resnet and mobilenet models (#1471) · b4cb5765
      raghuramank100 authored
      * add quantized models
      
      * Modify mobilenet.py documentation and clean up comments
      
      * Move fuse_model method to QuantizableInvertedResidual and clean up args documentation
      
      * Restore relu settings to default in resnet.py
      
      * Fix missing return in forward
      
      * Fix missing return in forwards
      
      * Change pretrained -> pretrained_float_models
        Replace InvertedResidual with block
      
      * Update tests to follow similar structure to test_models.py, allowing for modular testing
      
      * Replace forward method with simple function assignment
      
      * Fix error in arguments for resnet18
      
      * Add missing pretrained_float_model argument for mobilenet
      
      * Reference script for quantization aware training and post training quantization
      
      * Set pretrained_float_model to False and explicitly provide the float model
      
      * Address review comments:
        1. Replace forward with _forward
        2. Use pretrained models in reference train/eval script
        3. Modify test to skip if fbgemm is not supported
      
      * Fix lint errors.
        Use _forward for common code between float and quantized models
        Clean up linting for reference train scripts
        Test over all quantizable models
      
      * Update default values for args in quantization/train.py
      
      * Update models to conform to new API with quantize argument
        Remove apex in training script, add post training quant as an option
        Add support for separate calibration data set
      
      * Fix minor errors in train_quantization.py
      
      * Remove duplicate file
      
      * Bugfix
      
      * Minor improvements on the models
      
      * Expose print_freq to evaluate
      
      * Minor improvements on train_quantization.py
      
      * Ensure that quantized models are created and run on the specified backends
        Fix errors in test-only mode
      
      * Add model urls
      
      * Fix errors in quantized model tests.
        Speed up creation of random quantized models by removing histogram observers
      
      * Move setting qengine prior to convert
      
      * Fix lint error
      
      * Add readme.md
      
      * Readme.md
      
      * Fix lint
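      
      A short usage sketch of the quantizable models this PR adds, assuming the
      torchvision.models.quantization namespace with a quantize flag and a fuse_model() method:
      
      ```
      import torch
      import torchvision
      
      # pre-quantized model for inference on the fbgemm backend
      torch.backends.quantized.engine = 'fbgemm'
      qmodel = torchvision.models.quantization.resnet18(pretrained=True, quantize=True)
      qmodel.eval()
      
      # float model prepared manually for post-training quantization (calibration not shown)
      fmodel = torchvision.models.quantization.resnet18(pretrained=True, quantize=False)
      fmodel.fuse_model()
      fmodel.qconfig = torch.quantization.get_default_qconfig('fbgemm')
      torch.quantization.prepare(fmodel, inplace=True)
      # ... run calibration batches through fmodel ...
      torch.quantization.convert(fmodel, inplace=True)
      ```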
    • [WIP] Add commands for model training (#1203) · 9e27356f
      Francisco Massa authored
      * Initial version of README for classification reference scripts
      
      * More context
  10. 19 Jul, 2019 1 commit
    • Fix apex distributed training (#1124) · c187c2b1
      Vinh Nguyen authored
      * adding mixed precision training with Apex
      
      * fix APEX default optimization level
      
      * adding python version check for apex
      
      * fix LINT errors and raise exceptions if apex not available
      
      * fixing apex distributed training
      
      * fix throughput calculation: include forward pass
      
      * remove torch.cuda.set_device(args.gpu) as it's already called in init_distributed_mode
      
      * fix linter: new line
      
      * move Apex initialization code back to the beginning of main
      
      * move apex initialization to before the lr_scheduler is created, for peace of mind; doing apex initialization after the lr_scheduler seems to work fine as well (see the sketch below)
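      
      A rough ordering sketch of the apex initialization described above, assuming `model` and
      `optimizer` are already built; the `args.*` flag names are assumptions, not necessarily
      the script's actual ones:
      
      ```
      import torch
      import apex
      from apex import amp
      
      # initialize amp before creating the lr_scheduler and before DDP wrapping
      model, optimizer = amp.initialize(model, optimizer, opt_level=args.apex_opt_level)
      
      lr_scheduler = torch.optim.lr_scheduler.StepLR(
          optimizer, step_size=args.lr_step_size, gamma=args.lr_gamma)
      
      model_without_ddp = model
      if args.distributed:
          model = apex.parallel.DistributedDataParallel(model)
          model_without_ddp = model.module
      ```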
  11. 14 Jun, 2019 1 commit
    • utils.py in references can't work with pytorch-cpu (#1023) · 250bac89
      LXYTSOS authored
      * can't work with pytorch-cpu fixed
      
        utils.py can't work with pytorch-cpu because of this line of code:
        `memory=torch.cuda.max_memory_allocated()`
        (a sketch of a possible guard follows)
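      
      A minimal sketch of the kind of guard that avoids the CUDA call on CPU-only builds;
      the surrounding logging code is assumed:
      
      ```
      import torch
      
      if torch.cuda.is_available():
          # only query GPU memory stats when CUDA is actually present
          max_mem_mb = torch.cuda.max_memory_allocated() / (1024.0 * 1024.0)
          print('max mem: {:.0f} MB'.format(max_mem_mb))
      ```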
  12. 06 Jun, 2019 1 commit
  13. 21 May, 2019 1 commit
  14. 19 May, 2019 1 commit
  15. 08 May, 2019 1 commit
  16. 02 Apr, 2019 2 commits
  17. 28 Mar, 2019 1 commit
    • Initial version of classification reference scripts (#819) · 27ff89f6
      Francisco Massa authored
      * Initial version of classification reference training script
      
      * Updates
      
      * Minor updates
      
      * Expose a few more options
      
      * Load optimizer and lr_scheduler when resuming
      
      Also log the learning rate
      
      * Evaluation-only mode and minor improvements
      
        Identified a bug in the reporting of results: they need to be reduced
        across all processes (see the sketch after this list)
      
      * Address Soumith's comment
      
      * Fix some approximations in the evaluation metric
      
      * Flake8
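      
      A minimal sketch of reducing evaluation results across processes, as the bug note above
      describes; the helper and variable names are assumptions:
      
      ```
      import torch
      import torch.distributed as dist
      
      def reduce_across_processes(value, device):
          # sum a scalar metric (e.g. number of correct predictions) over all ranks
          t = torch.tensor(value, dtype=torch.float64, device=device)
          dist.barrier()
          dist.all_reduce(t)
          return t.item()
      
      # correct, total = ...  # computed locally on each rank
      # acc = reduce_across_processes(correct, device) / reduce_across_processes(total, device)
      ```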