1. 17 May, 2022 4 commits
  2. 16 May, 2022 2 commits
  3. 15 May, 2022 1 commit
    • apply import merging for fbcode (7 of 11) · b3a9204c
      John Reese authored
      Summary:
      Applies new import merging and sorting from µsort v1.0.
      
      When merging imports, µsort will make a best-effort to move associated
      comments to match merged elements, but there are known limitations due to
      the dynamic nature of Python and developer tooling. These changes should
      not produce any dangerous runtime changes, but may require touch-ups to
      satisfy linters and other tooling.
      
      Note that µsort uses case-insensitive, lexicographical sorting, which
      results in a different ordering compared to isort. This provides a more
      consistent sorting order, matching the case-insensitive order used when
      sorting import statements by module name, and ensures that "frog", "FROG",
      and "Frog" always sort next to each other.
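
The ordering difference described above can be sketched in plain Python. This is only an illustration using the built-in `sorted`, not µsort's actual implementation, and the tie-breaking between names that differ only by case is an assumption here (it simply follows input order).

```python
names = ["frog", "Zebra", "FROG", "apple", "Frog"]

# Default string comparison is by code point, so uppercase letters sort
# before lowercase ones and "frog" ends up far from "FROG" and "Frog".
case_sensitive = sorted(names)
print(case_sensitive)    # ['FROG', 'Frog', 'Zebra', 'apple', 'frog']

# Case-insensitive ordering: compare the casefolded form of each name.
# sorted() is stable, so names that differ only by case keep input order.
case_insensitive = sorted(names, key=str.casefold)
print(case_insensitive)  # ['apple', 'frog', 'FROG', 'Frog', 'Zebra']
```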
      
      For details on µsort's sorting and merging semantics, see the user guide:
      https://usort.readthedocs.io/en/stable/guide.html#sorting
      
      Reviewed By: lisroach
      
      Differential Revision: D36402205
      
      fbshipit-source-id: a4efc688d02da80c6e96685aa8eb00411615a366
  4. 14 May, 2022 2 commits
  5. 12 May, 2022 1 commit
    • formatting changes from black 22.3.0 · e1623106
      John Reese authored
      Summary:
      Applies the black-fbsource codemod with the new build of pyfmt.
      
      paintitblack
      
      Reviewed By: lisroach
      
      Differential Revision: D36324783
      
      fbshipit-source-id: 280c09e88257e5e569ab729691165d8dedd767bc
  6. 11 May, 2022 1 commit
  7. 10 May, 2022 1 commit
    • Fix a bug in export api that prevents setting specific kwargs for different backends · 70f236a6
      Tong Xiao authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/236
      
      When exporting the model to different backend engines, users may set the `model_export_kwargs` for different backends.
      
      The torchscript backend needs a placeholder `**export_kwargs` so that it accepts the kwargs meant for other backends.

      Frankly speaking, this mechanism of passing the same set of kwargs to different backends is confusing. It would be better to refactor it to a factory pattern with isolated kwargs.
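
A minimal sketch of why the placeholder matters, under stated assumptions: the function names, the `opset_version` option, and the `EXPORTERS` table below are hypothetical stand-ins, not the d2go export API. The point is only that one shared kwargs dict is passed to whichever backend is selected, so each backend must tolerate options it does not use.

```python
def export_torchscript(model, **export_kwargs):
    # **export_kwargs is a placeholder: options intended for other
    # backends are accepted here and ignored instead of raising TypeError.
    return {"backend": "torchscript", "model": model}


def export_other_backend(model, opset_version=11, **export_kwargs):
    # Hypothetical second backend that consumes one specific option.
    return {"backend": "other", "model": model, "opset_version": opset_version}


# Hypothetical dispatch table from backend name to export function.
EXPORTERS = {
    "torchscript": export_torchscript,
    "other": export_other_backend,
}


def export(model, backend, model_export_kwargs):
    # The same kwargs dict is forwarded regardless of the chosen backend.
    return EXPORTERS[backend](model, **model_export_kwargs)


shared_kwargs = {"opset_version": 13}
print(export("my_model", "torchscript", shared_kwargs))
print(export("my_model", "other", shared_kwargs))
```

Without `**export_kwargs` on `export_torchscript`, the first call would fail with a `TypeError` for the unexpected `opset_version` argument.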
      
      Reviewed By: HarounH, wat3rBro
      
      Differential Revision: D36140771
      
      fbshipit-source-id: f327559c1d063c9ce914a9afe2c1acf77c2aa287
  8. 29 Apr, 2022 2 commits
  9. 26 Apr, 2022 4 commits
  10. 25 Apr, 2022 2 commits
  11. 22 Apr, 2022 1 commit
  12. 21 Apr, 2022 2 commits
    • use existing qconfig to create learnable qconfig · 9584b934
      Yanghan Wang authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/215
      
      Follow-up to the comment in D35631192 (https://github.com/facebookresearch/d2go/commit/3204f147d67fb2ce7ac2600c46708195c15bfa3a).
      
      The current `get_learnable_qat_qconfig` implementation mimics the default qconfig getter functions, as noted in comments such as "follow `default_per_channel_weight_fake_quant`". Instead of creating a custom qconfig from scratch, this diff changes it to convert an existing qconfig to a learnable one, so that the process is transparent to orthogonal changes to the qconfig (e.g. a symmetric qscheme or a new backend).
      
      The following shows the difference between the learnable and non-learnable `QConfig` for `qnnpack` and `fbgemm`. The actual difference is just the added `use_grad_scaling=True` and the FakeQuant type changing from `FusedMovingAvgObsFakeQuantize` to `_LearnableFakeQuantize`. (The difference may be more obvious when copied into a text editor and compared side by side.)
      ```
      qat_utils.get_learnable_qat_qconfig("qnnpack")
      QConfig(
      	activation=functools.partial(
      		<class 'torch.ao.quantization._learnable_fake_quantize._LearnableFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>,
      		quant_min=0,
      		quant_max=255,
      		use_grad_scaling=True,
      		reduce_range=False
      	){},
      	weight=functools.partial(
      		<class 'torch.ao.quantization._learnable_fake_quantize._LearnableFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>,
      		quant_min=-128,
      		quant_max=127,
      		dtype=torch.qint8,
      		use_grad_scaling=True,
      		qscheme=torch.per_tensor_symmetric,
      		reduce_range=False
      	){}
      )
      
      torch.ao.quantization.get_default_qat_qconfig("qnnpack")
      QConfig(
      	activation=functools.partial(
      		<class 'torch.ao.quantization.fake_quantize.FusedMovingAvgObsFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>,
      		quant_min=0,
      		quant_max=255,
      
      		reduce_range=False
      	){},
      	weight=functools.partial(
      		<class 'torch.ao.quantization.fake_quantize.FusedMovingAvgObsFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>,
      		quant_min=-128,
      		quant_max=127,
      		dtype=torch.qint8,
      
      		qscheme=torch.per_tensor_symmetric,
      
      	){}
      )
      
      qat_utils.get_learnable_qat_qconfig("fbgemm")
      QConfig(
      	activation=functools.partial(
      		<class 'torch.ao.quantization._learnable_fake_quantize._LearnableFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>,
      		quant_min=0,
      		quant_max=255,
      		use_grad_scaling=True,
      		reduce_range=True
      	){},
      	weight=functools.partial(
      		<class 'torch.ao.quantization._learnable_fake_quantize._LearnableFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAveragePerChannelMinMaxObserver'>,
      		quant_min=-128,
      		quant_max=127,
      		dtype=torch.qint8,
      		use_grad_scaling=True,
      		qscheme=torch.per_channel_symmetric,
      		reduce_range=False,
      		ch_axis=0
      	){}
      )
      
      torch.ao.quantization.get_default_qat_qconfig("fbgemm")
      QConfig(
      	activation=functools.partial(
      		<class 'torch.ao.quantization.fake_quantize.FusedMovingAvgObsFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAverageMinMaxObserver'>,
      		quant_min=0,
      		quant_max=255,
      
      		reduce_range=True
      	){},
      	weight=functools.partial(
      		<class 'torch.ao.quantization.fake_quantize.FusedMovingAvgObsFakeQuantize'>,
      		observer=<class 'torch.ao.quantization.observer.MovingAveragePerChannelMinMaxObserver'>,
      		quant_min=-128,
      		quant_max=127,
      		dtype=torch.qint8,
      
      		qscheme=torch.per_channel_symmetric
      
      	){}
      )
      ```
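
The conversion idea can be sketched without PyTorch. Everything below is a stand-in for illustration: the `QConfig` namedtuple and the two classes are NOT the `torch.ao.quantization` types, and plain `functools.partial` is assumed as the factory wrapper. The sketch only shows the shape of the transformation: keep the existing factory arguments, swap the fake-quantize class, and add `use_grad_scaling=True`.

```python
import functools
from collections import namedtuple

# Stand-ins for illustration only (not the torch.ao.quantization classes).
QConfig = namedtuple("QConfig", ["activation", "weight"])


class FusedFakeQuantStandIn:
    def __init__(self, **kwargs):
        self.kwargs = kwargs


class LearnableFakeQuantStandIn:
    def __init__(self, **kwargs):
        self.kwargs = kwargs


def to_learnable(factory):
    """Rebuild a functools.partial factory around the learnable class."""
    kwargs = dict(factory.keywords)    # keep the existing settings
    kwargs["use_grad_scaling"] = True  # the only added argument
    return functools.partial(LearnableFakeQuantStandIn, *factory.args, **kwargs)


def convert_to_learnable_qconfig(qconfig):
    """Convert both factories of an existing qconfig; the input is untouched."""
    return QConfig(
        activation=to_learnable(qconfig.activation),
        weight=to_learnable(qconfig.weight),
    )


default_qconfig = QConfig(
    activation=functools.partial(FusedFakeQuantStandIn, quant_min=0, quant_max=255),
    weight=functools.partial(FusedFakeQuantStandIn, quant_min=-128, quant_max=127),
)
learnable_qconfig = convert_to_learnable_qconfig(default_qconfig)
print(learnable_qconfig.weight.keywords)
```

Because the learnable qconfig is derived from the existing one, any later change to the source qconfig's arguments carries over without editing the conversion code.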
      
      Reviewed By: kimishpatel
      
      Differential Revision: D35772970
      
      fbshipit-source-id: 0be8057e4f7ce3b315bd66d77aa88b733b676223
    • Fix Metal optimized models' augment with bundled inputs · c055a84f
      Owen Wang authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/216
      
      Sanity check after augment with bundled inputs fails unless tensor is moved to the correct backend.
      
      Fix warning where "-metal" or "-vulkan" is not correctly removed from the string.
      
      Temporary fix: Remove the call to augment with bundled inputs, because Metal backend for iOS GPU is not available on devserver. The true fix to unblock bundled inputs will be to add an input preformatting step op into metal models to convert input to Metal tensors (and no-op if already a metal tensor). This is outside the scope of this diff.
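
The suffix handling mentioned above might look roughly like the following. This is a sketch under assumptions: the commit message does not show the actual string format or helper, so `split_backend_suffix` and the example name `torchscript_mobile-metal` are hypothetical.

```python
# Suffixes named in the commit message.
GPU_BACKEND_SUFFIXES = ("-metal", "-vulkan")


def split_backend_suffix(export_type):
    """Return (base_name, gpu_backend), where gpu_backend is None if absent."""
    for suffix in GPU_BACKEND_SUFFIXES:
        if export_type.endswith(suffix):
            # Strip the suffix from the end only, and drop its leading "-".
            return export_type[: -len(suffix)], suffix[1:]
    return export_type, None


print(split_backend_suffix("torchscript_mobile-metal"))
print(split_backend_suffix("torchscript_mobile"))
```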
      
      Reviewed By: ymao1993
      
      Differential Revision: D35574266
      
      fbshipit-source-id: 9f7b5c72dff2e3cf0eddf871379b079a1ec658ff
  13. 19 Apr, 2022 2 commits
    • consolidate the creation of qconfig · 3204f147
      Yanghan Wang authored
      Summary: Pull Request resolved: https://github.com/facebookresearch/d2go/pull/210
      
      Reviewed By: kimishpatel
      
      Differential Revision: D35631192
      
      fbshipit-source-id: a713d86734c6937c16c7ced705171db9ea2f0894
    • apply import merging for fbcode/mobile-vision/d2go (3 of 4) · ae2f2f64
      Lisa Roach authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/212
      
      Applies new import merging and sorting from µsort v1.0.
      
      When merging imports, µsort will make a best-effort to move associated
      comments to match merged elements, but there are known limitations due to
      the dynamic nature of Python and developer tooling. These changes should
      not produce any dangerous runtime changes, but may require touch-ups to
      satisfy linters and other tooling.
      
      Note that µsort uses case-insensitive, lexicographical sorting, which
      results in a different ordering compared to isort. This provides a more
      consistent sorting order, matching the case-insensitive order used when
      sorting import statements by module name, and ensures that "frog", "FROG",
      and "Frog" always sort next to each other.
      
      For details on µsort's sorting and merging semantics, see the user guide:
      https://usort.readthedocs.io/en/stable/guide.html#sorting
      
      Reviewed By: jreese, wat3rBro
      
      Differential Revision: D35559673
      
      fbshipit-source-id: feeae2465ac2b62c44a0e92dc566e9a386567c9d
  14. 15 Apr, 2022 2 commits
    • Align_corners cannot be set when mode is nearest · d4c58688
      Zecheng He authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/211
      
      Align_corners cannot be set if the mode is nearest. Change the default to None.
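
The constraint can be sketched as a small helper that builds keyword arguments for an interpolation call. `torch.nn.functional.interpolate` only accepts `align_corners` for the interpolating modes (linear, bilinear, bicubic, trilinear); the helper name `interpolate_kwargs` is hypothetical and is not part of d2go or PyTorch.

```python
# Modes for which align_corners is meaningful.
INTERPOLATING_MODES = {"linear", "bilinear", "bicubic", "trilinear"}


def interpolate_kwargs(mode="nearest", align_corners=None):
    """Build kwargs, passing align_corners only when it was explicitly set."""
    kwargs = {"mode": mode}
    if align_corners is not None:
        if mode not in INTERPOLATING_MODES:
            raise ValueError(
                "align_corners can only be set with modes: "
                + ", ".join(sorted(INTERPOLATING_MODES))
            )
        kwargs["align_corners"] = align_corners
    return kwargs


print(interpolate_kwargs())                   # {'mode': 'nearest'}
print(interpolate_kwargs("bilinear", False))  # {'mode': 'bilinear', 'align_corners': False}
```

Defaulting to `None` rather than `False` lets the nearest mode work without the caller having to know about the restriction.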
      
      Reviewed By: wat3rBro
      
      Differential Revision: D35681284
      
      fbshipit-source-id: 23c57112e588c0b4ac5facfd61a7af0aa8a07ef0
    • enable moving traced model between devices · 2235f180
      Yanghan Wang authored
      Summary:
      X-link: https://github.com/facebookresearch/detectron2/pull/4132
      
      X-link: https://github.com/fairinternal/detectron2/pull/568
      
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/203
      
      For full discussion: https://fb.workplace.com/groups/1405155842844877/posts/5744470455580039
      
      Tracing `.to(device)` causes problems when moving the traced TorchScript model to another device (e.g. from cpu to gpu, or even from `cuda:0` to `cuda:1`). The reason is that `device` is not a `torch.Tensor`, so the tracer hardcodes its value during tracing. The solution is to script the casting operation.
      
      Here's the code snippet illustrating this:
      ```
      import copy

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      # define MyModel similar to GeneralizedRCNN, which casts the input to the model's device
      class MyModel(nn.Module):
          def __init__(self):
              super().__init__()
      
              self.conv1 = nn.Conv2d(3, 20, 5)
              self.conv2 = nn.Conv2d(20, 20, 5)
      
          def forward(self, x):
              # cast the input to the same device as this model, this makes it possible to
              # take a cpu tensor as input when the model is on GPU.
              x = x.to(self.conv1.weight.device)
      
              x = F.relu(self.conv1(x))
              return F.relu(self.conv2(x))
      
      # export the model by tracing
      model = MyModel()
      x = torch.zeros([1, 3, 32, 32])
      ts = torch.jit.trace(model, x)
      print(ts.graph)
      
      # =====================================================
      graph(%self.1 : __torch__.MyModel,
            %x : Float(1, 3, 32, 32, strides=[3072, 1024, 32, 1], requires_grad=0, device=cpu)):
        %conv2 : __torch__.torch.nn.modules.conv.___torch_mangle_0.Conv2d = prim::GetAttr[name="conv2"](%self.1)
        %conv1 : __torch__.torch.nn.modules.conv.Conv2d = prim::GetAttr[name="conv1"](%self.1)
        %14 : int = prim::Constant[value=6]() # <ipython-input-2-5abde0efc36f>:11:0
        %15 : int = prim::Constant[value=0]() # <ipython-input-2-5abde0efc36f>:11:0
        %16 : Device = prim::Constant[value="cpu"]() # <ipython-input-2-5abde0efc36f>:11:0
        %17 : NoneType = prim::Constant()
        %18 : bool = prim::Constant[value=0]() # <ipython-input-2-5abde0efc36f>:11:0
        %19 : bool = prim::Constant[value=0]() # <ipython-input-2-5abde0efc36f>:11:0
        %20 : NoneType = prim::Constant()
        %input.1 : Float(1, 3, 32, 32, strides=[3072, 1024, 32, 1], requires_grad=0, device=cpu) = aten::to(%x, %14, %15, %16, %17, %18, %19, %20) # <ipython-input-2-5abde0efc36f>:11:0
        %72 : Tensor = prim::CallMethod[name="forward"](%conv1, %input.1)
        %input.5 : Float(1, 20, 28, 28, strides=[15680, 784, 28, 1], requires_grad=1, device=cpu) = aten::relu(%72) # /mnt/xarfuse/uid-20293/a90d1698-seed-nspid4026533681_cgpid21128615-ns-4026533618/torch/nn/functional.py:1406:0
        %73 : Tensor = prim::CallMethod[name="forward"](%conv2, %input.5)
        %61 : Float(1, 20, 24, 24, strides=[11520, 576, 24, 1], requires_grad=1, device=cpu) = aten::relu(%73) # /mnt/xarfuse/uid-20293/a90d1698-seed-nspid4026533681_cgpid21128615-ns-4026533618/torch/nn/functional.py:1406:0
        return (%61)
      # =====================================================
      
      # PyTorch cuda works
      model = copy.deepcopy(model)
      model.to("cuda")
      y = model(x)
      # torchscript cpu works
      y = ts(x)
      # torchscript cuda doesn't work
      ts = ts.to("cuda")
      y = ts(x)
      
      # =====================================================
      RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor
      ---------------------------------------------------------------------------
      RuntimeError                              Traceback (most recent call last)
      <ipython-input-4-2aece3ad6c9a> in <module>
            7 # torchscript cuda doesn't work
            8 ts = ts.to("cuda")
      ----> 9 y = ts(x)
      /mnt/xarfuse/uid-20293/a90d1698-seed-nspid4026533681_cgpid21128615-ns-4026533618/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
         1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
         1109                 or _global_forward_hooks or _global_forward_pre_hooks):
      -> 1110             return forward_call(*input, **kwargs)
         1111         # Do not call functions when jit is used
         1112         full_backward_hooks, non_full_backward_hooks = [], []
      RuntimeError: The following operation failed in the TorchScript interpreter.
      # =====================================================
      
      # One solution is to script the cast instead of tracing it; the following code demonstrates how to do this using mixed scripting/tracing.
      @torch.jit.script_if_tracing
      def cast_device_like(src: torch.Tensor, dst: torch.Tensor) -> torch.Tensor:
          return src.to(dst.device)
      
      class MyModel2(nn.Module):
          def __init__(self):
              super().__init__()
      
              self.conv1 = nn.Conv2d(3, 20, 5)
              self.conv2 = nn.Conv2d(20, 20, 5)
      
          def forward(self, x):
              # cast the input to the same device as this model, this makes it possible to
              # take a cpu tensor as input when the model is on GPU.
              x = cast_device_like(x, self.conv1.weight)
      
              x = F.relu(self.conv1(x))
              return F.relu(self.conv2(x))
      
      # export the model by tracing
      model = MyModel2()
      x = torch.zeros([1, 3, 32, 32])
      ts = torch.jit.trace(model, x)
      print(ts.graph)
      
      # =====================================================
      graph(%self.1 : __torch__.MyModel2,
            %x : Float(1, 3, 32, 32, strides=[3072, 1024, 32, 1], requires_grad=0, device=cpu)):
        %conv2 : __torch__.torch.nn.modules.conv.___torch_mangle_5.Conv2d = prim::GetAttr[name="conv2"](%self.1)
        %conv1 : __torch__.torch.nn.modules.conv.___torch_mangle_4.Conv2d = prim::GetAttr[name="conv1"](%self.1)
        %conv1.1 : __torch__.torch.nn.modules.conv.___torch_mangle_4.Conv2d = prim::GetAttr[name="conv1"](%self.1)
        %weight.5 : Tensor = prim::GetAttr[name="weight"](%conv1.1)
        %14 : Function = prim::Constant[name="cast_device_like"]()
        %input.1 : Tensor = prim::CallFunction(%14, %x, %weight.5)
        %68 : Tensor = prim::CallMethod[name="forward"](%conv1, %input.1)
        %input.5 : Float(1, 20, 28, 28, strides=[15680, 784, 28, 1], requires_grad=1, device=cpu) = aten::relu(%68) # /mnt/xarfuse/uid-20293/a90d1698-seed-nspid4026533681_cgpid21128615-ns-4026533618/torch/nn/functional.py:1406:0
        %69 : Tensor = prim::CallMethod[name="forward"](%conv2, %input.5)
        %55 : Float(1, 20, 24, 24, strides=[11520, 576, 24, 1], requires_grad=1, device=cpu) = aten::relu(%69) # /mnt/xarfuse/uid-20293/a90d1698-seed-nspid4026533681_cgpid21128615-ns-4026533618/torch/nn/functional.py:1406:0
        return (%55)
      # =====================================================
      
      # PyTorch cuda works
      model = copy.deepcopy(model)
      model.to("cuda")
      y = model(x)
      # torchscript cpu works
      y = ts(x)
      # Note that now torchscript cuda works
      ts = ts.to("cuda")
      y = ts(x)
      print(y.device)
      
      # =====================================================
      cuda:0
      # =====================================================
      ```
      
      For D2, this diff creates a `move_tensor_device_same_as_another(A, B)` function to replace `A.to(B.device)`, and updates `rcnn.py` and all its utils.
      
      For D2Go, since the exported model becomes device-agnostic, we can remove the "_gpu" from the predictor type.
      
      Update (April 11):
      Add a test that covers tracing on one device and moving the traced model to another device for inference. When a GPU is available, it traces on `cuda:0` and runs inference on `cpu`, `cuda:0` (and `cuda:N-1` if available).
      
      Summary of the device-related patterns:
      - Using `.to(dtype=another_dtype)` doesn't affect the device.
      - Explicit device casting like `.to(device)` can generally be replaced by `move_device_like`.
      - When creating a variable directly on a device (e.g. `torch.zeros`, `torch.arange`), we can replace the call with a ScriptModule to avoid first creating it on CPU and then moving it to the new device.
          - Creating things on the tracing device and then moving them to a new device is dangerous, because the tracing device (e.g. `cuda:0`) might not be available (e.g. when running on a CPU-only machine).
          - It's hard to write `image_list.py` in this pattern because the size behaves differently during tracing (int vs. scalar tensor); in this diff, it still creates on CPU first and then moves to the target device.
      
      Reviewed By: tglik
      
      Differential Revision: D35367772
      
      fbshipit-source-id: 02d07e3d96da85f4cfbeb996e3c14c2a6f619beb
  15. 12 Apr, 2022 1 commit
  16. 07 Apr, 2022 1 commit
    • add metal GPU to d2go export · 6b4dbb31
      Owen Wang authored
      Summary: Allow the string name of the export type to indicate which mobile optimization backend the user wants to trigger.
      
      Reviewed By: wat3rBro
      
      Differential Revision: D35375928
      
      fbshipit-source-id: dc3f91564681344e1d43862423ab5dc63b6644d3
  17. 05 Apr, 2022 2 commits
    • support do_postprocess when tracing rcnn model in D2 style · 647a3fdf
      Yanghan Wang authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/200
      
      Currently, when exporting the RCNN model, we call it with `self.model.inference(inputs, do_postprocess=False)[0]`, so the output of the exported model is not post-processed (e.g. the mask is in the squared shape). This diff adds the option to include post-processing in the exported model.

      Note that since the input is a single tensor, the post-process doesn't resize the output to the original resolution, and we can't apply the post-process a second time in the Predictor's PostProcessFunc to resize it further; an assertion is added to raise an error in this case. This is fine for most production use cases, where the input is not resized.
      
      Set `RCNN_EXPORT.INCLUDE_POSTPROCESS` to `True` to enable this.
      
      Reviewed By: tglik
      
      Differential Revision: D34904058
      
      fbshipit-source-id: 65f120eadc9747e9918d26ce0bd7dd265931cfb5
    • refactor create_fake_detection_data_loader · 312c6b62
      Yanghan Wang authored
      Summary:
      Pull Request resolved: https://github.com/facebookresearch/d2go/pull/199
      
      - `create_fake_detection_data_loader` currently doesn't take `cfg` as input, but sometimes we need to test augmentations that require a different, more complicated cfg.
      - The name is not great; rename it to `create_detection_data_loader_on_toy_dataset`.
      - width/height were previously the resized size; change them to the size of the data source (image files) and use `cfg` to control the resized size.
      
      Update V3:
      V2 had some test failures because it built the data loader (via the GeneralizedRCNN runner) using the actual test config instead of the default config used before this diff, combined with the dataset name change. V3 uses the test's runner instead of the default runner for consistency. This reveals some real bugs that weren't tested before.
      
      Reviewed By: omkar-fb
      
      Differential Revision: D35238890
      
      fbshipit-source-id: 28a6037374e74f452f91b494bd455b38d3a48433
  18. 31 Mar, 2022 1 commit
  19. 30 Mar, 2022 1 commit
  20. 28 Mar, 2022 1 commit
  21. 25 Mar, 2022 1 commit
  22. 24 Mar, 2022 4 commits
  23. 22 Mar, 2022 1 commit
    • add .npy file handling in evaluator and visualizer · a0ee06f3
      Owen Wang authored
      Summary: Detectron2[Go]'s Visualizer and sem_seg_evaluation are now updated with customization entry points for reading semantic segmentation masks. By default, PIL and PNG images are expected. However, some projects' datasets use .npy files, and this customization allows providing an alternate Visualizer and evaluation function for reading them.
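
One way to picture such a customization entry point is a reader registry keyed by file extension. This is a hypothetical sketch: the commit message does not show the real hooks, so `MASK_READERS`, `register_mask_reader`, and `read_mask` are illustrative names only, and the readers here are dummies so the example runs without NumPy or PIL.

```python
import os

# Hypothetical registry mapping a file extension to a reader callable.
MASK_READERS = {}


def register_mask_reader(extension, reader):
    """Register a callable that takes a path and returns the mask."""
    MASK_READERS[extension.lower()] = reader


def read_mask(path):
    """Dispatch to the reader registered for the file's extension."""
    extension = os.path.splitext(path)[1].lower()
    reader = MASK_READERS.get(extension)
    if reader is None:
        raise ValueError("no mask reader registered for " + repr(extension))
    return reader(path)


# Dummy readers for illustration. In real use, the ".png" reader would
# decode the image (e.g. with PIL) and the ".npy" reader would load the
# array (e.g. with numpy.load).
register_mask_reader(".png", lambda path: ("png-mask", path))
register_mask_reader(".npy", lambda path: ("npy-mask", path))

print(read_mask("labels/0001.npy"))
```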
      
      Reviewed By: newstzpz
      
      Differential Revision: D33434948
      
      fbshipit-source-id: 42af16d6708ffc5b2c03ec8507757313e23c8204