Commit 0c32f094, authored by Georgia Gkioxari and committed by Facebook GitHub Bot

NDC/screen cameras API fix, compatibility with renderer

Summary:
API fix for NDC/screen cameras and compatibility with PyTorch3D renderers.

With this new fix:
* Users can define cameras and `transform_points` under any coordinate system conventions. The transformation applies the camera K and RT to the input points, without regard to PyTorch3D conventions. This makes cameras completely independent of the PyTorch3D renderer.

* Cameras can be defined either in NDC space or screen space. For existing ones, FoV cameras are in NDC space. Perspective/Orthographic can be defined in NDC or screen space.

* The interface with PyTorch3D renderers happens through `transform_points_ndc` which transforms points to the NDC space and assumes that input points are provided according to PyTorch3D conventions.

* Similarly, `transform_points_screen` transforms points to screen space and again assumes that input points are under PyTorch3D conventions.

* For Orthographic/Perspective cameras defined in screen space, `get_ndc_camera_transform` allows points to be converted to NDC for use with the renderers.

Reviewed By: nikhilaravi

Differential Revision: D26932657

fbshipit-source-id: 1a964e3e7caa54d10c792cf39c4d527ba2fb2e79
parent 9a14f54e
...@@ -13,7 +13,8 @@ This is the system the object/scene lives - the world. ...@@ -13,7 +13,8 @@ This is the system the object/scene lives - the world.
* **Camera view coordinate system** * **Camera view coordinate system**
This is the system that has its origin on the image plane and the `Z`-axis perpendicular to the image plane. In PyTorch3D, we assume that `+X` points left, and `+Y` points up and `+Z` points out from the image plane. The transformation from world to view happens after applying a rotation (`R`) and translation (`T`). This is the system that has its origin on the image plane and the `Z`-axis perpendicular to the image plane. In PyTorch3D, we assume that `+X` points left, and `+Y` points up and `+Z` points out from the image plane. The transformation from world to view happens after applying a rotation (`R`) and translation (`T`).
* **NDC coordinate system** * **NDC coordinate system**
This is the normalized coordinate system that confines in a volume the rendered part of the object/scene. Also known as view volume. Under the PyTorch3D convention, `(+1, +1, znear)` is the top left near corner, and `(-1, -1, zfar)` is the bottom right far corner of the volume. The transformation from view to NDC happens after applying the camera projection matrix (`P`). This is the normalized coordinate system that confines in a volume the rendered part of the object/scene. Also known as view volume. Under the PyTorch3D convention, `(+1, +1, znear)` is the top left near corner, and `(-1, -1, zfar)` is the bottom right far corner of the volume. For non-square volumes, the side of the volume in `XY` with the smallest length ranges from `[-1, 1]` while the larger side from `[-s, s]`, where `s` is the aspect ratio and `s > 1` (larger divided by smaller side).
The transformation from view to NDC happens after applying the camera projection matrix (`P`).
* **Screen coordinate system** * **Screen coordinate system**
This is another representation of the view volume with the `XY` coordinates defined in pixel space instead of a normalized space. This is another representation of the view volume with the `XY` coordinates defined in pixel space instead of a normalized space.
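As a rough illustration of the non-square case described in the NDC bullet above, the sketch below computes the `XY` extent of the NDC volume from an image size. The helper name `ndc_xy_range` is hypothetical and not part of PyTorch3D, and it takes the aspect ratio literally as "larger divided by smaller side"; the screen conversion formulas in this commit use `size - 1`, so pixel-exact bounds may differ slightly.

```python
def ndc_xy_range(height, width):
    """Return ((x_min, x_max), (y_min, y_max)) of the NDC volume.

    Hypothetical helper, not part of PyTorch3D. The shorter side spans
    [-1, 1] and the longer side spans [-s, s], where s is the aspect
    ratio (longer side divided by shorter side).
    """
    s = max(height, width) / min(height, width)
    if width >= height:
        # Image is wider than tall: X gets the larger range.
        return (-s, s), (-1.0, 1.0)
    # Image is taller than wide: Y gets the larger range.
    return (-1.0, 1.0), (-s, s)


print(ndc_xy_range(128, 256))  # ((-2.0, 2.0), (-1.0, 1.0))
```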
...@@ -22,47 +23,78 @@ An illustration of the 4 coordinate systems is shown below ...@@ -22,47 +23,78 @@ An illustration of the 4 coordinate systems is shown below
## Defining Cameras in PyTorch3D ## Defining Cameras in PyTorch3D
Cameras in PyTorch3D transform an object/scene from world to NDC by first transforming the object/scene to view (via transforms `R` and `T`) and then projecting the 3D object/scene to NDC (via the projection matrix `P`, else known as camera matrix). Thus, the camera parameters in `P` are assumed to be in NDC space. If the user has camera parameters in screen space, which is a common use case, the parameters should transformed to NDC (see below for an example) Cameras in PyTorch3D transform an object/scene from world to view by first transforming the object/scene to view (via transforms `R` and `T`) and then projecting the 3D object/scene to a normalized space via the projection matrix `P = K[R | T]`, where `K` is the intrinsic matrix. The camera parameters in `K` define the normalized space. If users define the camera parameters in NDC space, then the transform projects points to NDC. If the camera parameters are defined in screen space, the transformed points are in screen space.
We describe the camera types in PyTorch3D and the convention for the camera parameters provided at construction time. Note that the base `CamerasBase` class makes no assumptions about the coordinate systems. All the above transforms are geometric transforms defined purely by `R`, `T` and `K`. This means that users can define cameras in any coordinate system and for any transforms. The method `transform_points` will apply `K`, `R` and `T` to the input points as a simple matrix transformation. However, if users wish to use cameras with the PyTorch3D renderer, they need to abide by PyTorch3D's coordinate system assumptions (read below).
We provide instantiations of common camera types in PyTorch3D and how users can flexibly define the projection space below.
## Interfacing with the PyTorch3D Renderer
The PyTorch3D renderer for both meshes and point clouds assumes that the camera transformed points, meaning the points passed as input to the rasterizer, are in PyTorch3D's NDC space. So to get the expected rendering outcome, users need to make sure that their 3D input data and cameras abide by these PyTorch3D coordinate system assumptions. The PyTorch3D coordinate system assumes `+X: left`, `+Y: up` and `+Z: from us to scene` (right-handed). Confusion about coordinate systems is common, so we advise that you spend some time understanding your data and the coordinate system it lives in, and transform it accordingly before using the PyTorch3D renderer.
Examples of cameras and how they interface with the PyTorch3D renderer can be found in our tutorials.
### Camera Types ### Camera Types
All cameras inherit from `CamerasBase` which is a base class for all cameras. PyTorch3D provides four different camera types. The `CamerasBase` defines methods that are common to all camera models: All cameras inherit from `CamerasBase` which is a base class for all cameras. PyTorch3D provides four different camera types. The `CamerasBase` defines methods that are common to all camera models:
* `get_camera_center` that returns the optical center of the camera in world coordinates * `get_camera_center` that returns the optical center of the camera in world coordinates
* `get_world_to_view_transform` which returns a 3D transform from world coordinates to the camera view coordinates (R, T) * `get_world_to_view_transform` which returns a 3D transform from world coordinates to the camera view coordinates `(R, T)`
* `get_full_projection_transform` which composes the projection transform (P) with the world-to-view transform (R, T) * `get_full_projection_transform` which composes the projection transform (`K`) with the world-to-view transform `(R, T)`
* `transform_points` which takes a set of input points in world coordinates and projects to NDC coordinates ranging from [-1, -1, znear] to [+1, +1, zfar]. * `transform_points` which takes a set of input points in world coordinates and projects to NDC coordinates ranging from [-1, -1, znear] to [+1, +1, zfar].
* `get_ndc_camera_transform` which defines the conversion to PyTorch3D's NDC space and is called when interfacing with the PyTorch3D renderer. If the camera is defined in NDC space, then the identity transform is returned. If the camera is defined in screen space, the conversion from screen to NDC is returned. If users define their own camera in screen space, they need to provide the screen to NDC conversion themselves. We provide examples for the `PerspectiveCameras` and `OrthographicCameras`.
* `transform_points_ndc` which takes a set of points in world coordinates and projects them to PyTorch3D's NDC space
* `transform_points_screen` which takes a set of input points in world coordinates and projects them to the screen coordinates ranging from [0, 0, znear] to [W-1, H-1, zfar] * `transform_points_screen` which takes a set of input points in world coordinates and projects them to the screen coordinates ranging from [0, 0, znear] to [W-1, H-1, zfar]
Users can easily customize their own cameras. For each new camera, users should implement the `get_projection_transform` routine that returns the mapping `P` from camera view coordinates to NDC coordinates. Users can easily customize their own cameras. For each new camera, users should implement the `get_projection_transform` routine that returns the mapping `P` from camera view coordinates to NDC coordinates.
#### FoVPerspectiveCameras, FoVOrthographicCameras #### FoVPerspectiveCameras, FoVOrthographicCameras
These two cameras follow the OpenGL convention for perspective and orthographic cameras respectively. The user provides the near `znear` and far `zfar` field which confines the view volume in the `Z` axis. The view volume in the `XY` plane is defined by field of view angle (`fov`) in the case of `FoVPerspectiveCameras` and by `min_x, min_y, max_x, max_y` in the case of `FoVOrthographicCameras`. These two cameras follow the OpenGL convention for perspective and orthographic cameras respectively. The user provides the near `znear` and far `zfar` field which confines the view volume in the `Z` axis. The view volume in the `XY` plane is defined by field of view angle (`fov`) in the case of `FoVPerspectiveCameras` and by `min_x, min_y, max_x, max_y` in the case of `FoVOrthographicCameras`.
These cameras are by default in NDC space.
#### PerspectiveCameras, OrthographicCameras #### PerspectiveCameras, OrthographicCameras
These two cameras follow the Multi-View Geometry convention for cameras. The user provides the focal length (`fx`, `fy`) and the principal point (`px`, `py`). For example, `camera = PerspectiveCameras(focal_length=((fx, fy),), principal_point=((px, py),))` These two cameras follow the Multi-View Geometry convention for cameras. The user provides the focal length (`fx`, `fy`) and the principal point (`px`, `py`). For example, `camera = PerspectiveCameras(focal_length=((fx, fy),), principal_point=((px, py),))`
As mentioned above, the focal length and principal point are used to convert a point `(X, Y, Z)` from view coordinates to NDC coordinates, as follows The camera projection of a 3D point `(X, Y, Z)` in view coordinates to a point `(x, y, z)` in projection space (either NDC or screen) is
``` ```
# for perspective # for perspective camera
x_ndc = fx * X / Z + px x = fx * X / Z + px
y_ndc = fy * Y / Z + py y = fy * Y / Z + py
z_ndc = 1 / Z z = 1 / Z
# for orthographic # for orthographic camera
x_ndc = fx * X + px x = fx * X + px
y_ndc = fy * Y + py y = fy * Y + py
z_ndc = Z z = Z
```
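The projection equations above can be sketched in plain Python as follows. This is only an illustration for a single point with scalar parameters; the function name `project_point` and its argument layout are assumptions made for this example and are not part of the PyTorch3D API.

```python
def project_point(point, focal_length, principal_point, perspective=True):
    """Project a view-space point (X, Y, Z) with the equations above.

    Hypothetical helper for illustration only. Whether the result is in
    NDC or screen space depends on the space in which focal_length and
    principal_point are expressed.
    """
    X, Y, Z = point
    fx, fy = focal_length
    px, py = principal_point
    if perspective:
        # Perspective camera: divide X and Y by depth.
        return (fx * X / Z + px, fy * Y / Z + py, 1.0 / Z)
    # Orthographic camera: no division by depth.
    return (fx * X + px, fy * Y + py, Z)


print(project_point((1.0, 2.0, 4.0), (2.0, 2.0), (0.0, 0.0)))
# (0.5, 1.0, 0.25)
print(project_point((1.0, 2.0, 4.0), (2.0, 2.0), (0.0, 0.0), perspective=False))
# (2.0, 4.0, 4.0)
```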
The user can define the camera parameters in NDC or in screen space. Screen space camera parameters are common and for that case the user needs to set `in_ndc` to `False` and also provide the `image_size=(height, width)` of the screen, aka the image.
The `get_ndc_camera_transform` provides the transform from screen to NDC space in PyTorch3D. Note that the screen space assumes that the principal point is provided in the space with `+X left`, `+Y down` and origin at the top left corner of the image. To convert to NDC we need to account for the scaling of the normalized space as well as the change in `XY` direction.
Below are examples of equivalent `PerspectiveCameras` instantiations in NDC and screen space, respectively.
```python
# NDC space camera
fcl_ndc = (1.2,)
prp_ndc = ((0.2, 0.5),)
cameras_ndc = PerspectiveCameras(focal_length=fcl_ndc, principal_point=prp_ndc)
# Screen space camera
image_size = ((128, 256),) # (h, w)
fcl_screen = (76.2,) # fcl_ndc * (min(image_size) - 1) / 2
prp_screen = ((114.8, 31.75), ) # (w - 1) / 2 - px_ndc * (min(image_size) - 1) / 2, (h - 1) / 2 - py_ndc * (min(image_size) - 1) / 2
cameras_screen = PerspectiveCameras(focal_length=fcl_screen, principal_point=prp_screen, in_ndc=False, image_size=image_size)
``` ```
Commonly, users have access to the focal length (`fx_screen`, `fy_screen`) and the principal point (`px_screen`, `py_screen`) in screen space. In that case, to construct the camera the user needs to additionally provide the `image_size = ((image_width, image_height),)`. More precisely, `camera = PerspectiveCameras(focal_length=((fx_screen, fy_screen),), principal_point=((px_screen, py_screen),), image_size = ((image_width, image_height),))`. Internally, the camera parameters are converted from screen to NDC as follows: The relationship between screen and NDC specifications of a camera's `focal_length` and `principal_point` is given by the following equations, where `s = min(image_width, image_height)`.
The transformation of x and y coordinates between screen and NDC is exactly the same as for px and py.
``` ```
fx = fx_screen * 2.0 / image_width fx_ndc = fx_screen * 2.0 / (s - 1)
fy = fy_screen * 2.0 / image_height fy_ndc = fy_screen * 2.0 / (s - 1)
px = - (px_screen - image_width / 2.0) * 2.0 / image_width px_ndc = - (px_screen - (image_width - 1) / 2.0) * 2.0 / (s - 1)
py = - (py_screen - image_height / 2.0) * 2.0/ image_height py_ndc = - (py_screen - (image_height - 1) / 2.0) * 2.0 / (s - 1)
``` ```
This diff is collapsed.
...@@ -73,8 +73,7 @@ class MeshRasterizer(nn.Module): ...@@ -73,8 +73,7 @@ class MeshRasterizer(nn.Module):
Args: Args:
cameras: A cameras object which has a `transform_points` method cameras: A cameras object which has a `transform_points` method
which returns the transformed points after applying the which returns the transformed points after applying the
world-to-view and view-to-screen world-to-view and view-to-ndc transformations.
transformations.
raster_settings: the parameters for rasterization. This should be a raster_settings: the parameters for rasterization. This should be a
named tuple. named tuple.
...@@ -100,8 +99,8 @@ class MeshRasterizer(nn.Module): ...@@ -100,8 +99,8 @@ class MeshRasterizer(nn.Module):
vertex coordinates in world space. vertex coordinates in world space.
Returns: Returns:
meshes_screen: a Meshes object with the vertex positions in screen meshes_proj: a Meshes object with the vertex positions projected
space in NDC space
NOTE: keeping this as a separate function for readability but it could NOTE: keeping this as a separate function for readability but it could
be moved into forward. be moved into forward.
...@@ -126,12 +125,14 @@ class MeshRasterizer(nn.Module): ...@@ -126,12 +125,14 @@ class MeshRasterizer(nn.Module):
verts_view = cameras.get_world_to_view_transform(**kwargs).transform_points( verts_view = cameras.get_world_to_view_transform(**kwargs).transform_points(
verts_world, eps=eps verts_world, eps=eps
) )
verts_screen = cameras.get_projection_transform(**kwargs).transform_points( # view to NDC transform
verts_view, eps=eps to_ndc_transform = cameras.get_ndc_camera_transform(**kwargs)
) projection_transform = cameras.get_projection_transform(**kwargs).compose(to_ndc_transform)
verts_screen[..., 2] = verts_view[..., 2] verts_ndc = projection_transform.transform_points(verts_view, eps=eps)
meshes_screen = meshes_world.update_padded(new_verts_padded=verts_screen)
return meshes_screen verts_ndc[..., 2] = verts_view[..., 2]
meshes_ndc = meshes_world.update_padded(new_verts_padded=verts_ndc)
return meshes_ndc
def forward(self, meshes_world, **kwargs) -> Fragments: def forward(self, meshes_world, **kwargs) -> Fragments:
""" """
...@@ -141,7 +142,7 @@ class MeshRasterizer(nn.Module): ...@@ -141,7 +142,7 @@ class MeshRasterizer(nn.Module):
Returns: Returns:
Fragments: Rasterization outputs as a named tuple. Fragments: Rasterization outputs as a named tuple.
""" """
meshes_screen = self.transform(meshes_world, **kwargs) meshes_proj = self.transform(meshes_world, **kwargs)
raster_settings = kwargs.get("raster_settings", self.raster_settings) raster_settings = kwargs.get("raster_settings", self.raster_settings)
# By default, turn on clip_barycentric_coords if blur_radius > 0. # By default, turn on clip_barycentric_coords if blur_radius > 0.
...@@ -166,7 +167,7 @@ class MeshRasterizer(nn.Module): ...@@ -166,7 +167,7 @@ class MeshRasterizer(nn.Module):
z_clip = None if not perspective_correct or znear is None else znear / 2 z_clip = None if not perspective_correct or znear is None else znear / 2
pix_to_face, zbuf, bary_coords, dists = rasterize_meshes( pix_to_face, zbuf, bary_coords, dists = rasterize_meshes(
meshes_screen, meshes_proj,
image_size=raster_settings.image_size, image_size=raster_settings.image_size,
blur_radius=raster_settings.blur_radius, blur_radius=raster_settings.blur_radius,
faces_per_pixel=raster_settings.faces_per_pixel, faces_per_pixel=raster_settings.faces_per_pixel,
......
...@@ -55,8 +55,7 @@ class PointsRasterizer(nn.Module): ...@@ -55,8 +55,7 @@ class PointsRasterizer(nn.Module):
""" """
cameras: A cameras object which has a `transform_points` method cameras: A cameras object which has a `transform_points` method
which returns the transformed points after applying the which returns the transformed points after applying the
world-to-view and view-to-screen world-to-view and view-to-ndc transformations.
transformations.
raster_settings: the parameters for rasterization. This should be a raster_settings: the parameters for rasterization. This should be a
named tuple. named tuple.
...@@ -76,8 +75,8 @@ class PointsRasterizer(nn.Module): ...@@ -76,8 +75,8 @@ class PointsRasterizer(nn.Module):
point_clouds: a set of point clouds point_clouds: a set of point clouds
Returns: Returns:
points_screen: the points with the vertex positions in screen points_proj: the points with positions projected
space in NDC space
NOTE: keeping this as a separate function for readability but it could NOTE: keeping this as a separate function for readability but it could
be moved into forward. be moved into forward.
...@@ -93,14 +92,17 @@ class PointsRasterizer(nn.Module): ...@@ -93,14 +92,17 @@ class PointsRasterizer(nn.Module):
# TODO: Remove this line when the convention for the z coordinate in # TODO: Remove this line when the convention for the z coordinate in
# the rasterizer is decided. i.e. retain z in view space or transform # the rasterizer is decided. i.e. retain z in view space or transform
# to a different range. # to a different range.
eps = kwargs.get("eps", None)
pts_view = cameras.get_world_to_view_transform(**kwargs).transform_points( pts_view = cameras.get_world_to_view_transform(**kwargs).transform_points(
pts_world pts_world, eps=eps
) )
pts_screen = cameras.get_projection_transform(**kwargs).transform_points( # view to NDC transform
pts_view to_ndc_transform = cameras.get_ndc_camera_transform(**kwargs)
) projection_transform = cameras.get_projection_transform(**kwargs).compose(to_ndc_transform)
pts_screen[..., 2] = pts_view[..., 2] pts_ndc = projection_transform.transform_points(pts_view, eps=eps)
point_clouds = point_clouds.update_padded(pts_screen)
pts_ndc[..., 2] = pts_view[..., 2]
point_clouds = point_clouds.update_padded(pts_ndc)
return point_clouds return point_clouds
def to(self, device): def to(self, device):
...@@ -115,10 +117,10 @@ class PointsRasterizer(nn.Module): ...@@ -115,10 +117,10 @@ class PointsRasterizer(nn.Module):
Returns: Returns:
PointFragments: Rasterization outputs as a named tuple. PointFragments: Rasterization outputs as a named tuple.
""" """
points_screen = self.transform(point_clouds, **kwargs) points_proj = self.transform(point_clouds, **kwargs)
raster_settings = kwargs.get("raster_settings", self.raster_settings) raster_settings = kwargs.get("raster_settings", self.raster_settings)
idx, zbuf, dists2 = rasterize_points( idx, zbuf, dists2 = rasterize_points(
points_screen, points_proj,
image_size=raster_settings.image_size, image_size=raster_settings.image_size,
radius=raster_settings.radius, radius=raster_settings.radius,
points_per_pixel=raster_settings.points_per_pixel, points_per_pixel=raster_settings.points_per_pixel,
......
...@@ -124,17 +124,23 @@ def ndc_to_screen_points_naive(points, imsize): ...@@ -124,17 +124,23 @@ def ndc_to_screen_points_naive(points, imsize):
Transforms points from PyTorch3D's NDC space to screen space Transforms points from PyTorch3D's NDC space to screen space
Args: Args:
points: (N, V, 3) representing padded points points: (N, V, 3) representing padded points
imsize: (N, 2) image size = (width, height) imsize: (N, 2) image size = (height, width)
Returns: Returns:
(N, V, 3) tensor of transformed points (N, V, 3) tensor of transformed points
""" """
imwidth, imheight = imsize.unbind(1) height, width = imsize.unbind(1)
imwidth = imwidth.view(-1, 1) width = width.view(-1, 1)
imheight = imheight.view(-1, 1) half_width = (width - 1.0) / 2.0
height = height.view(-1, 1)
half_height = (height - 1.0) / 2.0
scale = (
half_width * (height > width).float() + half_height * (height <= width).float()
)
x, y, z = points.unbind(2) x, y, z = points.unbind(2)
x = (1.0 - x) * (imwidth - 1) / 2.0 x = -scale * x + half_width
y = (1.0 - y) * (imheight - 1) / 2.0 y = -scale * y + half_height
return torch.stack((x, y, z), dim=2) return torch.stack((x, y, z), dim=2)
...@@ -513,17 +519,23 @@ class TestCamerasCommon(TestCaseMixin, unittest.TestCase): ...@@ -513,17 +519,23 @@ class TestCamerasCommon(TestCaseMixin, unittest.TestCase):
screen_cam_params = {"R": R, "T": T} screen_cam_params = {"R": R, "T": T}
ndc_cam_params = {"R": R, "T": T} ndc_cam_params = {"R": R, "T": T}
if cam_type in (OrthographicCameras, PerspectiveCameras): if cam_type in (OrthographicCameras, PerspectiveCameras):
ndc_cam_params["focal_length"] = torch.rand((batch_size, 2)) * 3.0 fcl = torch.rand((batch_size, 2)) * 3.0 + 0.1
ndc_cam_params["principal_point"] = torch.randn((batch_size, 2)) prc = torch.randn((batch_size, 2)) * 0.2
# (height, width)
image_size = torch.randint(low=2, high=64, size=(batch_size, 2)) image_size = torch.randint(low=2, high=64, size=(batch_size, 2))
# scale
scale = (image_size.min(dim=1, keepdim=True).values - 1.0) / 2.0
ndc_cam_params["focal_length"] = fcl
ndc_cam_params["principal_point"] = prc
ndc_cam_params["image_size"] = image_size
screen_cam_params["image_size"] = image_size screen_cam_params["image_size"] = image_size
screen_cam_params["focal_length"] = ( screen_cam_params["focal_length"] = fcl * scale
ndc_cam_params["focal_length"] * image_size / 2.0
)
screen_cam_params["principal_point"] = ( screen_cam_params["principal_point"] = (
(1.0 - ndc_cam_params["principal_point"]) * image_size / 2.0 image_size[:, [1, 0]] - 1.0
) ) / 2.0 - prc * scale
screen_cam_params["in_ndc"] = False
else: else:
raise ValueError(str(cam_type)) raise ValueError(str(cam_type))
return cam_type(**ndc_cam_params), cam_type(**screen_cam_params) return cam_type(**ndc_cam_params), cam_type(**screen_cam_params)
...@@ -611,17 +623,22 @@ class TestCamerasCommon(TestCaseMixin, unittest.TestCase): ...@@ -611,17 +623,22 @@ class TestCamerasCommon(TestCaseMixin, unittest.TestCase):
# init the cameras # init the cameras
cameras = init_random_cameras(cam_type, batch_size) cameras = init_random_cameras(cam_type, batch_size)
# xyz - the ground truth point cloud # xyz - the ground truth point cloud
xyz = torch.randn(batch_size, num_points, 3) * 0.3 xy = torch.randn(batch_size, num_points, 2) * 2.0 - 1.0
z = torch.randn(batch_size, num_points, 1) * 3.0 + 1.0
xyz = torch.cat((xy, z), dim=2)
# image size # image size
image_size = torch.randint(low=2, high=64, size=(batch_size, 2)) image_size = torch.randint(low=32, high=64, size=(batch_size, 2))
# project points # project points
xyz_project_ndc = cameras.transform_points(xyz) xyz_project_ndc = cameras.transform_points_ndc(xyz)
xyz_project_screen = cameras.transform_points_screen(xyz, image_size) xyz_project_screen = cameras.transform_points_screen(
xyz, image_size=image_size
)
# naive # naive
xyz_project_screen_naive = ndc_to_screen_points_naive( xyz_project_screen_naive = ndc_to_screen_points_naive(
xyz_project_ndc, image_size xyz_project_ndc, image_size
) )
self.assertClose(xyz_project_screen, xyz_project_screen_naive) # we set atol to 1e-4, remember that screen points are in [0, W-1]x[0, H-1] space
self.assertClose(xyz_project_screen, xyz_project_screen_naive, atol=1e-4)
def test_equiv_project_points(self, batch_size=50, num_points=100): def test_equiv_project_points(self, batch_size=50, num_points=100):
""" """
...@@ -634,12 +651,15 @@ class TestCamerasCommon(TestCaseMixin, unittest.TestCase): ...@@ -634,12 +651,15 @@ class TestCamerasCommon(TestCaseMixin, unittest.TestCase):
ndc_cameras, ndc_cameras,
screen_cameras, screen_cameras,
) = TestCamerasCommon.init_equiv_cameras_ndc_screen(cam_type, batch_size) ) = TestCamerasCommon.init_equiv_cameras_ndc_screen(cam_type, batch_size)
# xyz - the ground truth point cloud # xyz - the ground truth point cloud in Py3D space
xyz = torch.randn(batch_size, num_points, 3) * 0.3 xy = torch.randn(batch_size, num_points, 2) * 0.3
z = torch.rand(batch_size, num_points, 1) + 3.0 + 0.1
xyz = torch.cat((xy, z), dim=2)
# project points # project points
xyz_ndc_cam = ndc_cameras.transform_points(xyz) xyz_ndc = ndc_cameras.transform_points_ndc(xyz)
xyz_screen_cam = screen_cameras.transform_points(xyz) xyz_screen = screen_cameras.transform_points_ndc(xyz)
self.assertClose(xyz_ndc_cam, xyz_screen_cam, atol=1e-6) # check correctness
self.assertClose(xyz_ndc, xyz_screen, atol=1e-5)
def test_clone(self, batch_size: int = 10): def test_clone(self, batch_size: int = 10):
""" """
......
...@@ -255,9 +255,20 @@ class TestRenderMeshes(TestCaseMixin, unittest.TestCase): ...@@ -255,9 +255,20 @@ class TestRenderMeshes(TestCaseMixin, unittest.TestCase):
device=device, device=device,
R=R, R=R,
T=T, T=T,
principal_point=((256.0, 256.0),), principal_point=(
focal_length=((256.0, 256.0),), (
(512.0 - 1.0) / 2.0,
(512.0 - 1.0) / 2.0,
),
),
focal_length=(
(
(512.0 - 1.0) / 2.0,
(512.0 - 1.0) / 2.0,
),
),
image_size=((512, 512),), image_size=((512, 512),),
in_ndc=False,
) )
rasterizer = MeshRasterizer( rasterizer = MeshRasterizer(
cameras=cameras, raster_settings=raster_settings cameras=cameras, raster_settings=raster_settings
......