"doc/git@developer.sourcefind.cn:OpenDAS/ktransformers.git" did not exist on "bb6920ed72241556b87f1fea704180143af2c997"
Commit b3060c7a authored by Stanislav Pidhorskyi, committed by Facebook GitHub Bot

Docstrings improvements

Summary: As the title says. This is for the Sphinx documentation.

Reviewed By: HapeMask

Differential Revision: D63440496

fbshipit-source-id: 483fdfc6cbc14ce8f88e6d048553488f1a0f8ed3
parent b0ca8b5c
...@@ -24,65 +24,97 @@ def edge_grad_estimator( ...@@ -24,65 +24,97 @@ def edge_grad_estimator(
index_img: th.Tensor, index_img: th.Tensor,
v_pix_img_hook: Optional[Callable[[th.Tensor], None]] = None, v_pix_img_hook: Optional[Callable[[th.Tensor], None]] = None,
) -> th.Tensor: ) -> th.Tensor:
""" """Makes the rasterized image ``img`` differentiable at visibility discontinuities
Args: and backpropagates the gradients to ``v_pix``.
v_pix: Pixel-space vertex coordinates with preserved camera-space Z-values.
N x V x 3 This function takes a rasterized image ``img`` that is assumed to be differentiable at
continuous regions but not at discontinuities. In some cases, ``img`` may not be differentiable
vi: face vertex index list tensor at all. For example, if the image is a rendered segmentation mask, it remains constant at
V x 3 continuous regions, making it non-differentiable. However, ``edge_grad_estimator`` can still
compute gradients at the discontinuities with respect to ``v_pix``.
bary_img: 3D barycentric coordinate image tensor
N x 3 x H x W The arguments ``bary_img`` and ``index_img`` must correspond exactly to the rasterized image
``img``. Each pixel in ``img`` should correspond to a fragment originating from the primitive
img: The rendered image specified by ``index_img`` and it should have barycentric coordinates specified by
N x C x H x W ``bary_img``. This means that with a small change to ``v_pix``, the pixels in ``img`` should
change accordingly. A frequent mistake that violates this condition is applying a mask
index_img: index image tensor to the rendered image to exclude unwanted regions, which leads to erroneous gradients.
N x H x W
The function returns the ``img`` unchanged but with added differentiability at the
discontinuities. Note that it is not necessary for the input ``img`` to require gradients,
but the returned ``img`` will always require gradients.
v_pix_img_hook: a backward hook that will be registered to v_pix_img. Useful for examining Args:
generated image space. Default None v_pix (Tensor): Pixel-space vertex coordinates, preserving the original camera-space
Z-values. Shape: :math:`(N, V, 3)`.
vi (Tensor): Face vertex index list tensor. Shape: :math:`(V, 3)`.
bary_img (Tensor): 3D barycentric coordinate image tensor. Shape: :math:`(N, 3, H, W)`.
img (Tensor): The rendered image. Shape: :math:`(N, C, H, W)`.
index_img (Tensor): Index image tensor. Shape: :math:`(N, H, W)`.
v_pix_img_hook (Optional[Callable[[th.Tensor], None]]): An optional backward hook that will
be registered to ``v_pix_img``. Useful for examining the generated image space. Default
is None.
Returns: Returns:
returns the img argument unchanged. Optionally also returns computed Tensor: Returns the input ``img`` unchanged. However, the returned image now has added
v_pix_img. Your loss should use the returned img, even though it is differentiability at visibility discontinuities. This returned image should be used for
unchanged. computing losses
Note: Note:
It is important to not spatially modify the rasterized image before passing it to edge_grad_estimator. It is crucial not to spatially modify the rasterized image before passing it to
Any operation as long as it is differentiable is ok after the edge_grad_estimator. `edge_grad_estimator`. That stems from the requirement that ``bary_img`` and ``index_img``
must correspond exactly to the rasterized image ``img``. That means that the location of all
discontinuities is controlled by ``v_pix`` and can be modified by modifying ``v_pix``.
Examples of operations that can be done before edge_grad_estimator: Operations that are allowed, as long as they are differentiable, include:
- Pixel-wise MLP - Pixel-wise MLP
- Color mapping - Color mapping
- Color correction, gamma correction - Color correction, gamma correction
If the operation is significantly non-linear, then it is recommended to do it before - Anything that would be indistinguishable from processing fragments independently
edge_grad_estimator. All sorts of clipping and clamping (e.g. `x.clamp(min=0.0, max=1.0)`), must be before their values get assigned to pixels of ``img``
done before edge_grad_estimator.
Examples of operations that are not allowed before edge_grad_estimator: Operations that **must be avoided** before `edge_grad_estimator` include:
- Gaussian blur - Gaussian blur
- Warping, deformation - Warping or deformation
- Masking, cropping, making holes. - Masking, cropping, or introducing holes
There is, however, no issue with applying them after `edge_grad_estimator`.
If the operation is highly non-linear, it is recommended to perform it before calling
:func:`edge_grad_estimator`.
All sorts of clipping and clamping (e.g., `x.clamp(min=0.0, max=1.0)`) must also be done
before invoking this function.
Usage:: Usage Example::
from drtk.renderlayer import edge_grad_estimator import torch.nn.functional as thf
from drtk import transform, rasterize, render, interpolate, edge_grad_estimator
... ...
out = renderlayer(v, tex, campos, camrot, focal, princpt, v_pix = transform(v, tex, campos, camrot, focal, princpt)
output_filters=["index_img", "render", "mask", "bary_img", "v_pix"]) index_img = rasterize(v_pix, vi, width=512, height=512)
_, bary_img = render(v_pix, vi, index_img)
vt_img = interpolate(vt, vti, index_img, bary_img)
img = thf.grid_sample(
tex,
vt_img.permute(0, 2, 3, 1),
mode="bilinear",
padding_mode="border",
align_corners=False
)
mask = (index_img != -1)[:, None, :, :]
img = out["render"] * out["mask"] img = img * mask
img = edge_grad_estimator( img = edge_grad_estimator(
v_pix=out["v_pix"], v_pix=v_pix,
vi=rl.vi, vi=vi,
bary_img=out["bary_img"], bary_img=bary_img,
img=img, img=img,
index_img=out["index_img"] index_img=index_img
) )
optim.zero_grad() optim.zero_grad()
...@@ -91,7 +123,10 @@ def edge_grad_estimator( ...@@ -91,7 +123,10 @@ def edge_grad_estimator(
optim.step() optim.step()
""" """
# Could use v_pix_img output from DRTK, but bary_img needs to be detached. # TODO: avoid call to interpolate, use backward kernel of interpolate directly
# Doing so will make `edge_grad_estimator` zero-overhead in forward pass
# At the moment, value of `v_pix_img` is ignored, and only passed to
# edge_grad_estimator so that backward kernel can be called with the computed gradient.
v_pix_img = interpolate(v_pix, vi, index_img, bary_img.detach()) v_pix_img = interpolate(v_pix, vi, index_img, bary_img.detach())
img = th.ops.edge_grad_ext.edge_grad_estimator(v_pix, v_pix_img, vi, img, index_img) img = th.ops.edge_grad_ext.edge_grad_estimator(v_pix, v_pix_img, vi, img, index_img)
...@@ -111,7 +146,7 @@ def edge_grad_estimator_ref( ...@@ -111,7 +146,7 @@ def edge_grad_estimator_ref(
) -> th.Tensor: ) -> th.Tensor:
""" """
Python reference implementation for Python reference implementation for
:func:`drtk.edge_grad_estimator.edge_grad_estimator`. :func:`drtk.edge_grad_estimator`.
""" """
# could use v_pix_img output from DRTK, but bary_img needs to be detached. # could use v_pix_img output from DRTK, but bary_img needs to be detached.
......
...@@ -3,6 +3,11 @@ ...@@ -3,6 +3,11 @@
# This source code is licensed under the MIT license found in the # This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree. # LICENSE file in the root directory of this source tree.
"""
``drtk.interpolate`` module provides functions for differentiable interpolation of vertex
attributes across the fragments, i.e. pixels covered by the primitive.
"""
import torch as th import torch as th
from drtk import interpolate_ext from drtk import interpolate_ext
...@@ -18,6 +23,7 @@ def interpolate( ...@@ -18,6 +23,7 @@ def interpolate(
) -> th.Tensor: ) -> th.Tensor:
""" """
Performs a linear interpolation of the vertex attributes given the barycentric coordinates Performs a linear interpolation of the vertex attributes given the barycentric coordinates
Args: Args:
vert_attributes (th.Tensor): vertex attribute tensor vert_attributes (th.Tensor): vertex attribute tensor
N x V x C N x V x C
...@@ -27,12 +33,14 @@ def interpolate( ...@@ -27,12 +33,14 @@ def interpolate(
N x H x W N x H x W
bary_img (th.Tensor): 3D barycentric coordinate image tensor bary_img (th.Tensor): 3D barycentric coordinate image tensor
N x 3 x H x W N x 3 x H x W
Returns: Returns:
A tensor with interpolated vertex attributes with a shape [N, C, H, W] A tensor with interpolated vertex attributes with a shape [N, C, H, W]
Note:
1. The default of `channels_last` is set to true to make this function backward compatible. .. warning::
Please consider using the argument `channels_last` instead of permuting the result afterward. The returned tensor has only valid values for pixels which have a valid index in ``index_img``.
2. By default, the output is not contiguous. Make sure you call .contiguous() if that is a requirement. For all other pixels, which had index ``-1`` in ``index_img``, the returned tensor will have non-zero
values which should be ignored.
""" """
return th.ops.interpolate_ext.interpolate(vert_attributes, vi, index_img, bary_img) return th.ops.interpolate_ext.interpolate(vert_attributes, vi, index_img, bary_img)
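As a rough illustration of the `interpolate` docstring above, the following pure-Python sketch interpolates vertex attributes for a single image using nested lists instead of tensors. It is not the `drtk` kernel: the helper name `interpolate_pixelwise` is hypothetical, and writing zeros for uncovered pixels is an assumption made for readability (the real function returns values there that should simply be ignored).

```python
def interpolate_pixelwise(vert_attributes, vi, index_img, bary_img):
    """Hypothetical per-pixel reference for one image (no batch dimension).

    vert_attributes: V x C list of per-vertex attributes
    vi:              F x 3 list of vertex indices per triangle
    index_img:       H x W list of triangle ids, -1 where nothing is covered
    bary_img:        3 x H x W list of barycentric coordinates
    Returns a C x H x W list.
    """
    height, width = len(index_img), len(index_img[0])
    channels = len(vert_attributes[0])
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            tri = index_img[y][x]
            if tri == -1:
                continue  # assumption: keep zeros for uncovered pixels
            for k in range(3):
                weight = bary_img[k][y][x]
                vertex = vert_attributes[vi[tri][k]]
                for c in range(channels):
                    out[c][y][x] += weight * vertex[c]
    return out


# One triangle with a scalar attribute per vertex, rendered into a 1 x 2 image.
attrs = [[0.0], [1.0], [2.0]]
faces = [[0, 1, 2]]
index_img = [[0, -1]]
bary_img = [[[0.25, 0.0]], [[0.25, 0.0]], [[0.5, 0.0]]]
result = interpolate_pixelwise(attrs, faces, index_img, bary_img)
```

The covered pixel receives the barycentric-weighted sum of the three vertex attributes, while the pixel with index `-1` is skipped, which mirrors the warning that such pixels carry no meaningful value.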
...@@ -44,7 +52,8 @@ def interpolate_ref( ...@@ -44,7 +52,8 @@ def interpolate_ref(
bary_img: th.Tensor, bary_img: th.Tensor,
) -> th.Tensor: ) -> th.Tensor:
""" """
A reference implementation for `interpolate`. See the doc string from `interpolate` A reference implementation of :func:`drtk.interpolate` in pure PyTorch.
This function is used for tests only, please see :func:`drtk.interpolate` for documentation.
""" """
# Run reference implementation in double precision to get as good reference as possible # Run reference implementation in double precision to get as good reference as possible
......
...@@ -24,36 +24,38 @@ def rasterize( ...@@ -24,36 +24,38 @@ def rasterize(
Rasterizes a mesh defined by v and vi. Rasterizes a mesh defined by v and vi.
Args: Args:
v (th.Tensor): vertex positions. The first two components are the projected vertex's location (x, y) v (th.Tensor): vertex positions. The first two components are the projected vertex's
on the image plane. The coordinates of the top left corner are (-0.5, -0.5), and the coordinates of location (x, y) on the image plane. The coordinates of the top left corner are
the bottom right corner are (width - 0.5, height - 0.5). The z component is expected to be in the (-0.5, -0.5), and the coordinates of the bottom right corner are
camera space coordinate frame (before projection). (width - 0.5, height - 0.5). The z component is expected to be in the camera space
coordinate frame (before projection).
N x V x 3 N x V x 3
vi (th.Tensor): face vertex index list tensor. The most significant nibble of vi is reserved for vi (th.Tensor): face vertex index list tensor. The most significant nibble of vi is
controlling visibility of the edges in wireframe mode. In non-wireframe mode, content of the most reserved for controlling visibility of the edges in wireframe mode. In non-wireframe
significant nibble of vi will be ignored. mode, content of the most significant nibble of vi will be ignored.
V x 3 V x 3
height (int): height of the image in pixels. height (int): height of the image in pixels.
width (int): width of the image in pixels. width (int): width of the image in pixels.
wireframe (bool): If False (default), rasterizes triangles. If True, rasterizes lines, where the most wireframe (bool): If False (default), rasterizes triangles. If True, rasterizes lines,
significant nibble of vi is reinterpreted as a bit field controlling the visibility of the edges. The where the most significant nibble of vi is reinterpreted as a bit field controlling
least significant bit controls the visibility of the first edge, the second bit controls the the visibility of the edges. The least significant bit controls the visibility of the
visibility of the second edge, and the third bit controls the visibility of the third edge. This first edge, the second bit controls the visibility of the second edge, and the third
limits the maximum number of vertices to 268435455. bit controls the visibility of the third edge. This limits the maximum number of
vertices to 268435455.
Returns: Returns:
The rasterized image of triangle indices which is represented with an index tensor of a shape The rasterized image of triangle indices which is represented with an index tensor of a
[N, H, W] of type int32 that stores a triangle ID for each pixel. If a triangle covers a pixel and is shape [N, H, W] of type int32 that stores a triangle ID for each pixel. If a triangle
the closest triangle to the camera, then the pixel will have the ID of that triangle. If no triangles covers a pixel and is the closest triangle to the camera, then the pixel will have the
cover a pixel, then its ID is -1. ID of that triangle. If no triangles cover a pixel, then its ID is -1.
Note: Note:
This function is not differentiable. The gradients should be computed with `edge_grad_estimator` This function is not differentiable. The gradients should be computed with
instead. :func:`edge_grad_estimator` instead.
""" """
_, index_img = th.ops.rasterize_ext.rasterize(v, vi, height, width, wireframe) _, index_img = th.ops.rasterize_ext.rasterize(v, vi, height, width, wireframe)
return index_img return index_img
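The vertex limit quoted in the `rasterize` docstring follows from reserving four bits of each index. The sketch below reproduces that arithmetic in plain Python. It assumes 32-bit indices with the edge flags stored in the three lowest bits of the top nibble; both helper names are hypothetical and not part of `drtk`, and the docstring does not state how the flags are distributed across the three indices of a face.

```python
INDEX_BITS = 28  # assumption: 32-bit index minus the most significant nibble
INDEX_MASK = (1 << INDEX_BITS) - 1  # largest index that fits in the low 28 bits


def pack_index(vertex_index, edge_flags):
    """Hypothetical helper: store a 3-bit edge visibility field above the index."""
    if not 0 <= vertex_index <= INDEX_MASK:
        raise ValueError("vertex index does not fit in 28 bits")
    if not 0 <= edge_flags <= 0b111:
        raise ValueError("edge flags must fit in 3 bits")
    return (edge_flags << INDEX_BITS) | vertex_index


def unpack_index(packed):
    """Hypothetical helper: split a packed value into (vertex_index, edge_flags)."""
    return packed & INDEX_MASK, (packed >> INDEX_BITS) & 0b111


packed = pack_index(12345, 0b101)  # first and third edges flagged as visible
vertex_index, edge_flags = unpack_index(packed)
```

Here `INDEX_MASK` equals `2**28 - 1`, which is the 268435455 mentioned in the docstring.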
...@@ -68,22 +70,24 @@ def rasterize_with_depth( ...@@ -68,22 +70,24 @@ def rasterize_with_depth(
wireframe: bool = False, wireframe: bool = False,
) -> Tuple[th.Tensor, th.Tensor]: ) -> Tuple[th.Tensor, th.Tensor]:
""" """
Same as `rasterize` function, additionally returns depth image. Internally it uses the same implementation Same as :func:`rasterize` function, additionally returns depth image. Internally it uses the
as the rasterize function which still computes depth but does not return depth. same implementation as the rasterize function which still computes depth but does not return
depth.
Notes: Note:
The function is not differentiable, including the depth output. The function is not differentiable, including the depth output.
The split is done intentionally to hide the depth image from the user as it is not differentiable which The split is done intentionally to hide the depth image from the user as it is not
may cause errors if assumed otherwise. Instead, the `barycentrics` function should be used to differentiable which may cause errors if assumed otherwise. Instead, the `barycentrics` function
should be used to
compute the differentiable version of depth. compute the differentiable version of depth.
However, we still provide `rasterize_with_depth` which returns non-differentiable depth which could allow However, we still provide `rasterize_with_depth` which returns non-differentiable depth which
to avoid call to `barycentrics` function when differentiability is not required. could allow to avoid call to `barycentrics` function when differentiability is not required.
Returns: Returns:
The rasterized image of triangle indices of shape [N, H, W] and a depth image of shape [N, H, W]. The rasterized image of triangle indices of shape [N, H, W] and a depth image of shape
Values of pixels in the depth image not covered by any triangle are 0. [N, H, W]. Values of pixels in the depth image not covered by any triangle are 0.
""" """
depth_img, index_img = th.ops.rasterize_ext.rasterize( depth_img, index_img = th.ops.rasterize_ext.rasterize(
......
...@@ -22,38 +22,37 @@ def transform( ...@@ -22,38 +22,37 @@ def transform(
fov: Optional[th.Tensor] = None, fov: Optional[th.Tensor] = None,
) -> th.Tensor: ) -> th.Tensor:
""" """
v: Tensor, N x V x 3 Projects 3D vertex positions onto the image plane of the camera.
Batch of vertex positions for vertices in the mesh.
Args:
campos: Tensor, N x 3 v (th.Tensor): vertex positions. N x V x 3
Camera position. campos (Tensor): Camera position. N x 3
camrot (Tensor): Camera rotation matrix. N x 3 x 3
camrot: Tensor, N x 3 x 3 focal (Tensor): Focal length. The upper left 2x2 block of the intrinsic matrix
Camera rotation matrix. [[f_x, s], [0, f_y]]. N x 2 x 2
princpt (Tensor): Camera principal point [cx, cy]. N x 2
focal: Tensor, N x 2 x 2 K (Tensor): Camera intrinsic calibration matrix, N x 3 x 3
Focal length [[fx, 0], Rt (Tensor): Camera extrinsic matrix. N x 3 x 4 or N x 4 x 4
[0, fy]] distortion_mode (List[str]): Names of the distortion modes.
distortion_coeff (Tensor): Distortion coefficients. N x 4
princpt: Tensor, N x 2 fov (Tensor): Valid field of view of the distortion model. N x 1
Principal point [cx, cy]
Returns:
K: Tensor, N x 3 x 3 Vertex positions projected onto the image plane of the camera. The last dimension has
Camera intrinsic calibration matrix. Either this or both (focal, still size 3. The first two components are the x and y coordinates on the image plane,
princpt) must be provided. and the z is z component of the vertex positions in the camera frame. The latter is used
for depth values that are written to the z-buffer. N x V x 3
Rt: Tensor, N x 3 x 4 or N x 4 x 4
Camera extrinsic matrix. Either this or both (camrot, campos) must be .. warning::
provided. Camrot is the upper 3x3 of Rt, campos = -R.T @ t. You must specify either ``K`` (intrinsic matrix) or both ``focal`` and ``princpt``
(focal length and principal point).
distortion_mode: List[str]
Names of the distortion modes. Additionally, you must provide either ``Rt`` (extrinsic matrix) or both ``campos``
(camera position) and ``camrot`` (camera rotation).
distortion_coeff: Tensor, N x 4
Distortion coefficients. .. note::
If we split ``Rt`` of shape N x 3 x 4 into ``R`` of shape N x 3 x 3 and ``t`` of
fov: Tensor, N x 1 shape N x 3 x 1, then: ``camrot`` is ``R``, and ``campos`` is ``-R.T @ t``.
Valid field of view of the distortion model.
""" """
......
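The relation between ``Rt`` and the pair (``camrot``, ``campos``) stated in the note of the `transform` docstring can be checked numerically. The following is a minimal sketch in plain Python for a single, unbatched 3 x 4 matrix; the helpers `matvec`, `transpose`, and `split_rt` are hypothetical names introduced here, not `drtk` functions.

```python
def matvec(m, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def split_rt(rt):
    """Split a 3x4 extrinsic matrix into camrot (R) and campos (-R.T @ t)."""
    camrot = [row[:3] for row in rt]
    t = [row[3] for row in rt]
    campos = [-c for c in matvec(transpose(camrot), t)]
    return camrot, campos


# A 90 degree rotation about the z axis plus a translation.
rt = [
    [0.0, -1.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 1.0, 3.0],
]
camrot, campos = split_rt(rt)

# Sanity check: the camera centre maps to the origin of the camera frame,
# so R @ campos + t should vanish.
t = [row[3] for row in rt]
residual = [a + b for a, b in zip(matvec(camrot, campos), t)]
```

The vanishing residual is what makes the two parameterizations interchangeable: a point located at ``campos`` in world space lands at the origin of the camera frame.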