Commit 92aa2fa8 authored by zachteed

initial commit

__pycache__
build
dist
*.egg-info
*.vscode/
*.pth
tests
checkpoints
datasets
runs
a.out
cache
*.g2o
[submodule "eigen"]
path = eigen
url = https://gitlab.com/libeigen/eigen
BSD 3-Clause License
Copyright (c) 2021, princeton-vl
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# LieTorch: Tangent Space Backpropagation
## Introduction
The LieTorch library generalizes PyTorch to 3D transformation groups. Just as `torch.Tensor` is a multi-dimensional matrix of scalar elements, `lietorch.SE3` is a multi-dimensional matrix of SE3 elements. We support common tensor manipulations such as indexing, reshaping, and broadcasting. Group operations can be composed into computation graphs and backpropagation is automatically performed in the tangent space of each element. For more details, please see our paper:
<center><img src="lietorch.png" width="480" style="center"></center>
Tangent Space Backpropagation for 3D Transformation Groups
Zachary Teed and Jia Deng, CVPR 2021
## Installation
### Requirements:
* Cuda >= 10.1 (with nvcc compiler)
* PyTorch >= 1.6
We recommend installing within a virtual environment. Make sure you clone using the `--recursive` flag. If you are using Anaconda, the following commands can be used to install all dependencies:
```
git clone --recursive https://github.com/princeton-vl/lietorch.git
cd lietorch
conda create -n lie_env
conda activate lie_env
conda install scipy pyyaml pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
```
To run the examples, you will need OpenCV and Open3D. Depending on your operating system, they can either be installed with pip or may need to be built from source:
```
pip install opencv-python open3d
```
### Installing:
Clone the repo using the `--recursive` flag and install using `setup.py` (may take up to 10 minutes)
```
git clone --recursive https://github.com/princeton-vl/lietorch.git
python setup.py install
./run_tests.sh
```
## Overview
LieTorch currently supports the following 3D transformation groups:
| Group | Dimension | Action |
| -------| --------- | ------------- |
| SO3 | 3 | rotation |
| RxSO3 | 4 | rotation + scaling |
| SE3 | 6 | rotation + translation |
| Sim3 | 7 | rotation + translation + scaling |
Each group supports the following operations:
| Operation | Map | Description |
| -------| --------| ------------- |
| exp | g -> G | exponential map |
| log | G -> g | logarithm map |
| inv | G -> G | group inverse |
| mul | G x G -> G | group multiplication |
| adj | G x g -> g | adjoint |
| adjT | G x g* -> g* | dual adjoint |
| act | G x R3 -> R3 | action on point (set) |
| act4 | G x P3 -> P3 | action on homogeneous point (set) |
&nbsp;
#### Simple Example:
Compute the angles between all pairs of rotation matrices
```python
import torch
from lietorch import SO3
phi = torch.randn(8000, 3, device='cuda', requires_grad=True)
R = SO3.exp(phi)
# relative rotation matrix, SO3 ^ {8000 x 8000}
dR = R[:,None].inv() * R[None,:]
# 8000 x 8000 matrix of angles
ang = dR.log().norm(dim=-1)
# backpropagation in tangent space
loss = ang.sum()
loss.backward()
```
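The `act` operation from the table above works the same way. Here is a minimal sketch (untested, and assuming the shapes broadcast as in the example above):
```python
import torch
from lietorch import SE3

# random rigid-body transforms from 6-dof tangent vectors (SE3 has dimension 6)
xi = torch.randn(100, 6, device='cuda', requires_grad=True)
T = SE3.exp(xi)

# transform a batch of 3D points; gradients flow back to the tangent vectors
pts = torch.randn(100, 3, device='cuda')
pts_transformed = T.act(pts)
pts_transformed.sum().backward()
```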
## Examples
We provide real use cases in the examples directory:
1. Pose Graph Optimization
2. Deep SE3/Sim3 Registration
3. RGB-D SLAM / VO
### Acknowledgements
Many of the Lie Group implementations are adapted from [Sophus](https://github.com/strasdat/Sophus).
Subproject commit 824272cde8ca2541e8b67b0887f5ded92b128d1f
import torch
import torchvision.transforms as transforms
import numpy as np
import torch.nn.functional as F
class RGBDAugmentor:
""" perform augmentation on RGB-D video """
def __init__(self, crop_size):
self.crop_size = crop_size
self.augcolor = transforms.Compose([
transforms.ToPILImage(),
transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.4/3.14),
transforms.ToTensor()])
self.max_scale = 0.25
def spatial_transform(self, images, depths, poses, intrinsics):
""" cropping and resizing """
ht, wd = images.shape[2:]
max_scale = self.max_scale
min_scale = np.log2(np.maximum(
(self.crop_size[0] + 1) / float(ht),
(self.crop_size[1] + 1) / float(wd)))
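        # sample a log2-uniform resize factor; min_scale guarantees the resized image still covers crop_size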
scale = 2 ** np.random.uniform(min_scale, max_scale)
intrinsics = scale * intrinsics
depths = depths.unsqueeze(dim=1)
images = F.interpolate(images, scale_factor=scale, mode='bilinear',
align_corners=True, recompute_scale_factor=True)
depths = F.interpolate(depths, scale_factor=scale, recompute_scale_factor=True)
# always perform center crop (TODO: try non-center crops)
y0 = (images.shape[2] - self.crop_size[0]) // 2
x0 = (images.shape[3] - self.crop_size[1]) // 2
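        # shift the principal point (cx, cy) to account for the crop offset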
intrinsics = intrinsics - torch.tensor([0.0, 0.0, x0, y0])
images = images[:, :, y0:y0+self.crop_size[0], x0:x0+self.crop_size[1]]
depths = depths[:, :, y0:y0+self.crop_size[0], x0:x0+self.crop_size[1]]
depths = depths.squeeze(dim=1)
return images, poses, depths, intrinsics
def color_transform(self, images):
""" color jittering """
num, ch, ht, wd = images.shape
images = images.permute(1, 2, 3, 0).reshape(ch, ht, wd*num)
images = 255 * self.augcolor(images[[2,1,0]] / 255.0)
return images[[2,1,0]].reshape(ch, ht, wd, num).permute(3,0,1,2).contiguous()
def __call__(self, images, poses, depths, intrinsics):
images = self.color_transform(images)
return self.spatial_transform(images, depths, poses, intrinsics)
import numpy as np
import torch
import torch.utils.data as data
import torch.nn.functional as F
import csv
import os
import cv2
import math
import random
import json
import pickle
import os.path as osp
from .augmentation import RGBDAugmentor
from .rgbd_utils import *
class RGBDDataset(data.Dataset):
def __init__(self, root, name, n_frames=4, crop_size=[384,512], fmin=8.0, fmax=75.0, do_aug=True):
""" Base class for RGBD dataset """
self.aug = None
self.root = root
self.name = name
self.n_frames = n_frames
self.fmin = fmin # exclude very easy examples
self.fmax = fmax # exclude very hard examples
if do_aug:
self.aug = RGBDAugmentor(crop_size=crop_size)
# building dataset is expensive, cache so only needs to be performed once
cur_path = osp.dirname(osp.abspath(__file__))
cache_path = osp.join(cur_path, 'cache', '{}.pickle'.format(self.name))
if not osp.isdir(osp.join(cur_path, 'cache')):
os.mkdir(osp.join(cur_path, 'cache'))
if osp.isfile(cache_path):
scene_info = pickle.load(open(cache_path, 'rb'))[0]
else:
scene_info = self._build_dataset()
with open(cache_path, 'wb') as cachefile:
pickle.dump((scene_info,), cachefile)
self.scene_info = scene_info
self._build_dataset_index()
def _build_dataset_index(self):
self.dataset_index = []
for scene in self.scene_info:
if not self.__class__.is_test_scene(scene):
graph = self.scene_info[scene]['graph']
for i in graph:
if len(graph[i][0]) > self.n_frames:
self.dataset_index.append((scene, i))
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
return np.load(depth_file)
def build_frame_graph(self, poses, depths, intrinsics, f=16, max_flow=256):
""" compute optical flow distance between all pairs of frames """
def read_disp(fn):
depth = self.__class__.depth_read(fn)[f//2::f, f//2::f]
depth[depth < 0.01] = np.mean(depth)
return 1.0 / depth
poses = np.array(poses)
intrinsics = np.array(intrinsics) / f
disps = np.stack(list(map(read_disp, depths)), 0)
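        # flow is computed on the f-times downsampled grid; scale by f to get distances in full-resolution pixels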
d = f * compute_distance_matrix_flow(poses, disps, intrinsics)
# uncomment for nice visualization
# import matplotlib.pyplot as plt
# plt.imshow(d)
# plt.show()
graph = {}
for i in range(d.shape[0]):
j, = np.where(d[i] < max_flow)
graph[i] = (j, d[i,j])
return graph
def __getitem__(self, index):
""" return training video """
index = index % len(self.dataset_index)
scene_id, ix = self.dataset_index[index]
frame_graph = self.scene_info[scene_id]['graph']
images_list = self.scene_info[scene_id]['images']
depths_list = self.scene_info[scene_id]['depths']
poses_list = self.scene_info[scene_id]['poses']
intrinsics_list = self.scene_info[scene_id]['intrinsics']
inds = [ ix ]
while len(inds) < self.n_frames:
# get other frames within flow threshold
k = (frame_graph[ix][1] > self.fmin) & (frame_graph[ix][1] < self.fmax)
frames = frame_graph[ix][0][k]
# prefer frames forward in time
if np.count_nonzero(frames[frames > ix]):
ix = np.random.choice(frames[frames > ix])
elif np.count_nonzero(frames):
ix = np.random.choice(frames)
inds += [ ix ]
images, depths, poses, intrinsics = [], [], [], []
for i in inds:
images.append(self.__class__.image_read(images_list[i]))
depths.append(self.__class__.depth_read(depths_list[i]))
poses.append(poses_list[i])
intrinsics.append(intrinsics_list[i])
images = np.stack(images).astype(np.float32)
depths = np.stack(depths).astype(np.float32)
poses = np.stack(poses).astype(np.float32)
intrinsics = np.stack(intrinsics).astype(np.float32)
images = torch.from_numpy(images).float()
images = images.permute(0, 3, 1, 2)
depths = torch.from_numpy(depths)
poses = torch.from_numpy(poses)
intrinsics = torch.from_numpy(intrinsics)
if self.aug is not None:
images, poses, depths, intrinsics = \
self.aug(images, poses, depths, intrinsics)
return images, poses, depths, intrinsics
def __len__(self):
return len(self.dataset_index)
def __imul__(self, x):
self.dataset_index *= x
return self
import numpy as np
import torch
import torch.utils.data as data
import torch.nn.functional as F
import csv
import os
import cv2
import math
import random
import json
import pickle
import os.path as osp
from lietorch import SE3
from .base import RGBDDataset
from .stream import RGBDStream
from .augmentation import RGBDAugmentor
from .rgbd_utils import loadtum, all_pairs_distance_matrix
class ETH3D(RGBDDataset):
def __init__(self, **kwargs):
super(ETH3D, self).__init__(root='datasets/ETH3D', name='ETH3D', **kwargs)
@staticmethod
def is_test_scene(scene):
return False
def _build_dataset(self):
from tqdm import tqdm
print("Building ETH3D dataset")
scene_info = {}
dataset_index = []
for scene in tqdm(os.listdir(self.root)):
scene_path = osp.join(self.root, scene)
if not osp.isdir(scene_path):
continue
# don't use scenes with no rgb info
if 'dark' in scene or 'kidnap' in scene:
continue
scene_data, graph = {}, {}
images, depths, poses, intrinsics = loadtum(scene_path, skip=2)
# graph of co-visible frames based on flow
graph = self.build_frame_graph(poses, depths, intrinsics)
scene_info[scene] = {'images': images, 'depths': depths,
'poses': poses, 'intrinsics': intrinsics, 'graph': graph}
return scene_info
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
depth = cv2.imread(depth_file, cv2.IMREAD_ANYDEPTH)
return depth.astype(np.float32) / 5000.0
class ETH3DStream(RGBDStream):
def __init__(self, datapath, **kwargs):
super(ETH3DStream, self).__init__(datapath=datapath, **kwargs)
def _build_dataset_index(self):
""" build list of images, poses, depths, and intrinsics """
images, depths, poses, intrinsics = loadtum(self.datapath, self.frame_rate)
# set first pose to identity
poses = SE3(torch.as_tensor(poses))
poses = poses[[0]].inv() * poses
poses = poses.data.cpu().numpy()
self.images = images
self.poses = poses
self.depths = depths
self.intrinsics = intrinsics
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
depth = cv2.imread(depth_file, cv2.IMREAD_ANYDEPTH)
return depth.astype(np.float32) / 5000.0
import pickle
import os
import os.path as osp
# RGBD-Dataset
from .tartan import TartanAir
from .nyu2 import NYUv2
from .eth3d import ETH3D
from .scannet import ScanNet
# streaming datasets for inference
from .eth3d import ETH3DStream
from .tum import TUMStream
from .tartan import TartanAirStream
def dataset_factory(dataset_list, **kwargs):
""" create a combined dataset """
from torch.utils.data import ConcatDataset
dataset_map = {
'tartan': (TartanAir, 1),
'nyu': (NYUv2, 2),
'eth': (ETH3D, 5),
'scannet': (ScanNet, 1)}
db_list = []
for key in dataset_list:
# cache datasets for faster future loading
db = dataset_map[key][0](**kwargs)
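        # replicate the dataset index (via RGBDDataset.__imul__) to weight datasets in the concatenation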
db *= dataset_map[key][1]
print("Dataset {} has {} images".format(key, len(db)))
db_list.append(db)
return ConcatDataset(db_list)
def create_datastream(dataset_path, **kwargs):
""" create data_loader to stream images 1 by 1 """
from torch.utils.data import DataLoader
if osp.isfile(osp.join(dataset_path, 'calibration.txt')):
db = ETH3DStream(dataset_path, **kwargs)
elif osp.isfile(osp.join(dataset_path, 'rgb.txt')):
db = TUMStream(dataset_path, **kwargs)
elif osp.isdir(osp.join(dataset_path, 'image_left')):
        db = TartanAirStream(dataset_path, **kwargs)
stream = DataLoader(db, shuffle=False, batch_size=1, num_workers=4)
return stream
import numpy as np
import torch
import glob
import cv2
import os
import os.path as osp
from .base import RGBDDataset
from .augmentation import RGBDAugmentor
from .rgbd_utils import all_pairs_distance_matrix, loadtum
class NYUv2(RGBDDataset):
def __init__(self, **kwargs):
super(NYUv2, self).__init__(root='datasets/NYUv2', name='NYUv2', **kwargs)
@staticmethod
def is_test_scene(scene):
return False
def _build_dataset(self):
from tqdm import tqdm
print("Building NYUv2 dataset")
scene_info = {}
dataset_index = []
scenes = os.listdir(self.root)
for scene in tqdm(scenes):
scene_path = osp.join(self.root, scene)
images, depths, poses, intrinsics = loadtum(scene_path, frame_rate=10)
# filter out some errors in dataset
if images is None or len(images) < 8:
continue
intrinsic = NYUv2.calib_read()
intrinsics = [intrinsic] * len(images)
# graph of co-visible frames based on flow
graph = self.build_frame_graph(poses, depths, intrinsics)
scene_info[scene] = {'images': images, 'depths': depths,
'poses': poses, 'intrinsics': intrinsics, 'graph': graph}
return scene_info
@staticmethod
def calib_read():
fx = 5.1885790117450188e+02
fy = 5.1946961112127485e+02
cx = 3.2558244941119034e+02
cy = 2.5373616633400465e+02
return np.array([fx, fy, cx, cy])
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
depth = cv2.imread(depth_file, cv2.IMREAD_ANYDEPTH)
return depth.astype(np.float32) / 5000.0
import numpy as np
import os.path as osp
import torch
from lietorch import SE3
import geom.projective_ops as pops
from scipy.spatial.transform import Rotation
def parse_list(filepath, skiprows=0):
""" read list data """
data = np.loadtxt(filepath, delimiter=' ', dtype=np.unicode_, skiprows=skiprows)
return data
def associate_frames(tstamp_image, tstamp_depth, tstamp_pose, max_dt=0.08):
""" pair images, depths, and poses """
associations = []
for i, t in enumerate(tstamp_image):
if tstamp_pose is None:
j = np.argmin(np.abs(tstamp_depth - t))
if (np.abs(tstamp_depth[j] - t) < max_dt):
associations.append((i, j))
else:
j = np.argmin(np.abs(tstamp_depth - t))
k = np.argmin(np.abs(tstamp_pose - t))
if (np.abs(tstamp_depth[j] - t) < max_dt) and \
(np.abs(tstamp_pose[k] - t) < max_dt):
associations.append((i, j, k))
return associations
def loadtum(datapath, frame_rate=-1):
""" read video data in tum-rgbd format """
if osp.isfile(osp.join(datapath, 'groundtruth.txt')):
pose_list = osp.join(datapath, 'groundtruth.txt')
elif osp.isfile(osp.join(datapath, 'pose.txt')):
pose_list = osp.join(datapath, 'pose.txt')
image_list = osp.join(datapath, 'rgb.txt')
depth_list = osp.join(datapath, 'depth.txt')
calib_path = osp.join(datapath, 'calibration.txt')
intrinsic = None
if osp.isfile(calib_path):
intrinsic = np.loadtxt(calib_path, delimiter=' ')
intrinsic = intrinsic.astype(np.float64)
image_data = parse_list(image_list)
depth_data = parse_list(depth_list)
pose_data = parse_list(pose_list, skiprows=1)
pose_vecs = pose_data[:,1:].astype(np.float64)
tstamp_image = image_data[:,0].astype(np.float64)
tstamp_depth = depth_data[:,0].astype(np.float64)
tstamp_pose = pose_data[:,0].astype(np.float64)
associations = associate_frames(tstamp_image, tstamp_depth, tstamp_pose)
indicies = [ 0 ]
for i in range(1, len(associations)):
t0 = tstamp_image[associations[indicies[-1]][0]]
t1 = tstamp_image[associations[i][0]]
if t1 - t0 > 1.0 / frame_rate:
indicies += [ i ]
images, poses, depths, intrinsics = [], [], [], []
for ix in indicies:
(i, j, k) = associations[ix]
images += [ osp.join(datapath, image_data[i,1]) ]
depths += [ osp.join(datapath, depth_data[j,1]) ]
poses += [ pose_vecs[k] ]
if intrinsic is not None:
intrinsics += [ intrinsic ]
return images, depths, poses, intrinsics
def all_pairs_distance_matrix(poses, beta=2.5):
""" compute distance matrix between all pairs of poses """
poses = np.array(poses, dtype=np.float32)
    poses[:,:3] *= beta # scale to balance rot + trans
poses = SE3(torch.from_numpy(poses))
r = (poses[:,None].inv() * poses[None,:]).log()
return r.norm(dim=-1).cpu().numpy()
def pose_matrix_to_quaternion(pose):
""" convert 4x4 pose matrix to (t, q) """
q = Rotation.from_matrix(pose[:3, :3]).as_quat()
return np.concatenate([pose[:3, 3], q], axis=0)
def compute_distance_matrix_flow(poses, disps, intrinsics):
""" compute flow magnitude between all pairs of frames """
if not isinstance(poses, SE3):
poses = torch.from_numpy(poses).float().cuda()[None]
poses = SE3(poses).inv()
disps = torch.from_numpy(disps).float().cuda()[None]
intrinsics = torch.from_numpy(intrinsics).float().cuda()[None]
N = poses.shape[1]
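    # all ordered frame pairs (ii, jj) for which flow magnitude will be evaluated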
ii, jj = torch.meshgrid(torch.arange(N), torch.arange(N))
ii = ii.reshape(-1).cuda()
jj = jj.reshape(-1).cuda()
MAX_FLOW = 100.0
matrix = np.zeros((N, N), dtype=np.float32)
s = 2048
for i in range(0, ii.shape[0], s):
flow1, val1 = pops.induced_flow(poses, disps, intrinsics, ii[i:i+s], jj[i:i+s])
flow2, val2 = pops.induced_flow(poses, disps, intrinsics, jj[i:i+s], ii[i:i+s])
flow = torch.stack([flow1, flow2], dim=2)
val = torch.stack([val1, val2], dim=2)
mag = flow.norm(dim=-1).clamp(max=MAX_FLOW)
mag = mag.view(mag.shape[1], -1)
val = val.view(val.shape[1], -1)
mag = (mag * val).mean(-1) / val.mean(-1)
mag[val.mean(-1) < 0.7] = np.inf
i1 = ii[i:i+s].cpu().numpy()
j1 = jj[i:i+s].cpu().numpy()
matrix[i1, j1] = mag.cpu().numpy()
return matrix
import re
import numpy as np
import pickle
import torch
import glob
import cv2
import os
import os.path as osp
from .base import RGBDDataset
from .augmentation import RGBDAugmentor
from .rgbd_utils import pose_matrix_to_quaternion
from .rgbd_utils import all_pairs_distance_matrix
class ScanNet(RGBDDataset):
def __init__(self, **kwargs):
super(ScanNet, self).__init__(root='datasets/ScanNet', name='ScanNet', **kwargs)
@staticmethod
def is_test_scene(scene):
scanid = int(re.findall(r'scene(.+?)_', scene)[0])
return scanid > 660
def _build_dataset_index(self):
""" construct scene_info and dataset_index objects """
from tqdm import tqdm
print("Building ScanNet dataset")
scene_info = {}
dataset_index = []
for scene in tqdm(os.listdir(self.root)):
scene_path = osp.join(self.root, scene)
depth_glob = osp.join(scene_path, 'depth', '*.png')
depth_list = glob.glob(depth_glob)
get_indicies = lambda x: int(osp.basename(x).split('.')[0])
get_images = lambda i: osp.join(scene_path, 'color', '%d.jpg' % i)
get_depths = lambda i: osp.join(scene_path, 'depth', '%d.png' % i)
get_poses = lambda i: osp.join(scene_path, 'pose', '%d.txt' % i)
indicies = sorted(map(get_indicies, depth_list))[::2]
image_list = list(map(get_images, indicies))
depth_list = list(map(get_depths, indicies))
pose_list = map(get_poses, indicies)
pose_list = list(map(ScanNet.pose_read, pose_list))
# remove nan poses
pvecs = np.stack(pose_list, 0)
keep, = np.where(~np.any(np.isnan(pvecs) | np.isinf(pvecs), axis=1))
images = [image_list[i] for i in keep]
depths = [depth_list[i] for i in keep]
poses = [pose_list[i] for i in keep]
intrinsic = ScanNet.calib_read(scene_path)
intrinsics = [intrinsic] * len(images)
graph = self.build_frame_graph(poses, depths, intrinsics)
scene_info[scene] = {'images': images, 'depths': depths,
'poses': poses, 'intrinsics': intrinsics, 'graph': graph}
for i in range(len(images)):
if len(graph[i][0]) > 1:
dataset_index.append((scene, i))
return scene_info, dataset_index
@staticmethod
def calib_read(scene_path):
intrinsic_file = osp.join(scene_path, 'intrinsic', 'intrinsic_depth.txt')
K = np.loadtxt(intrinsic_file, delimiter=' ')
return np.array([K[0,0], K[1,1], K[0,2], K[1,2]])
@staticmethod
def pose_read(pose_file):
pose = np.loadtxt(pose_file, delimiter=' ').astype(np.float64)
return pose_matrix_to_quaternion(pose)
@staticmethod
def image_read(image_file):
image = cv2.imread(image_file)
return cv2.resize(image, (640, 480))
@staticmethod
def depth_read(depth_file):
depth = cv2.imread(depth_file, cv2.IMREAD_ANYDEPTH)
return depth.astype(np.float32) / 1000.0
import numpy as np
import torch
import torch.utils.data as data
import torch.nn.functional as F
import csv
import os
import cv2
import math
import random
import json
import pickle
import os.path as osp
from .rgbd_utils import *
class RGBDStream(data.Dataset):
def __init__(self, datapath, frame_rate=-1, crop_size=[384,512]):
self.datapath = datapath
self.frame_rate = frame_rate
self.crop_size = crop_size
self._build_dataset_index()
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
return np.load(depth_file)
def __len__(self):
return len(self.images)
def __getitem__(self, index):
""" return training video """
image = self.__class__.image_read(self.images[index])
image = torch.from_numpy(image).float()
image = image.permute(2, 0, 1)
depth = self.__class__.depth_read(self.depths[index])
depth = torch.from_numpy(depth).float()
pose = torch.from_numpy(self.poses[index]).float()
intrinsic = torch.from_numpy(self.intrinsics[index]).float()
sx = self.crop_size[1] / depth.shape[1]
sy = self.crop_size[0] / depth.shape[0]
image = F.interpolate(image[None], self.crop_size, mode='bilinear', align_corners=True)[0]
depth = F.interpolate(depth[None,None], self.crop_size, mode='nearest')[0,0]
image = image[..., 8:-8, 8:-8]
depth = depth[..., 8:-8, 8:-8]
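        # the resize and 8-pixel border crop change the camera model: rescale focal lengths and shift the principal point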
fx, fy, cx, cy = intrinsic.unbind(dim=0)
intrinsic = torch.stack([sx*fx, sy*fy, sx*cx - 8, sy*cy - 8])
# intrinsic *= torch.as_tensor([sx, sy, sx, sy])
return index, image, depth, pose, intrinsic
import numpy as np
import torch
import glob
import cv2
import os
import os.path as osp
from .base import RGBDDataset
from .stream import RGBDStream
class TartanAir(RGBDDataset):
# scale depths to balance rot & trans
DEPTH_SCALE = 5.0
TEST_SET = ['westerndesert', 'seasidetown', 'seasonsforest_winter', 'office2', 'gascola']
def __init__(self, mode='training', **kwargs):
self.mode = mode
self.n_frames = 2
super(TartanAir, self).__init__(root='datasets/TartanAir', name='TartanAir', **kwargs)
@staticmethod
def is_test_scene(scene):
return scene.split('/')[-3] in TartanAir.TEST_SET
def _build_dataset(self):
from tqdm import tqdm
print("Building TartanAir dataset")
scene_info = {}
scenes = glob.glob(osp.join(self.root, '*/*/*/*'))
for scene in tqdm(sorted(scenes)):
images = sorted(glob.glob(osp.join(scene, 'image_left/*.png')))
depths = sorted(glob.glob(osp.join(scene, 'depth_left/*.npy')))
poses = np.loadtxt(osp.join(scene, 'pose_left.txt'), delimiter=' ')
poses = poses[:, [1, 2, 0, 4, 5, 3, 6]]
poses[:,:3] /= TartanAir.DEPTH_SCALE
intrinsics = [TartanAir.calib_read()] * len(images)
# graph of co-visible frames based on flow
graph = self.build_frame_graph(poses, depths, intrinsics)
scene = '/'.join(scene.split('/'))
scene_info[scene] = {'images': images, 'depths': depths,
'poses': poses, 'intrinsics': intrinsics, 'graph': graph}
return scene_info
@staticmethod
def calib_read():
return np.array([320.0, 320.0, 320.0, 240.0])
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
depth = np.load(depth_file) / TartanAir.DEPTH_SCALE
        depth[np.isnan(depth)] = 1.0
        depth[np.isinf(depth)] = 1.0
return depth
class TartanAirTest(torch.utils.data.Dataset):
def __init__(self, root='datasets/Tartan'):
self.root = root
self.dataset_index = []
self.scene_info = {}
scenes = glob.glob(osp.join(self.root, '*/*/*/*'))
for scene in sorted(scenes):
image_glob = osp.join(scene, 'image_left/*.png')
depth_glob = osp.join(scene, 'depth_left/*.npy')
images = sorted(glob.glob(image_glob))
depths = sorted(glob.glob(depth_glob))
poses = np.loadtxt(osp.join(scene, 'pose_left.txt'), delimiter=' ')
poses = poses[:, [1, 2, 0, 4, 5, 3, 6]]
poses[:,:3] /= TartanAir.DEPTH_SCALE
intrinsics = [TartanAir.calib_read()] * len(images)
self.scene_info[scene] = {'images': images,
'depths': depths, 'poses': poses, 'intrinsics': intrinsics}
with open('assets/tartan_test.txt') as f:
self.dataset_index = f.readlines()
def __getitem__(self, index):
""" load test example """
scene_id, ix1, ix2 = self.dataset_index[index].split()
inds = [int(ix1), int(ix2)]
images_list = self.scene_info[scene_id]['images']
depths_list = self.scene_info[scene_id]['depths']
poses_list = self.scene_info[scene_id]['poses']
intrinsics_list = self.scene_info[scene_id]['intrinsics']
images, depths, poses, intrinsics = [], [], [], []
for i in inds:
images.append(TartanAir.image_read(images_list[i]))
depths.append(TartanAir.depth_read(depths_list[i]))
poses.append(poses_list[i])
intrinsics.append(intrinsics_list[i])
images = np.stack(images).astype(np.float32)
depths = np.stack(depths).astype(np.float32)
poses = np.stack(poses).astype(np.float32)
intrinsics = np.stack(intrinsics).astype(np.float32)
images = torch.from_numpy(images).float()
images = images.permute(0, 3, 1, 2)
depths = torch.from_numpy(depths)
poses = torch.from_numpy(poses)
intrinsics = torch.from_numpy(intrinsics)
return images, poses, depths, intrinsics
def __len__(self):
return len(self.dataset_index)
class TartanAirStream(RGBDStream):
def __init__(self, datapath, **kwargs):
super(TartanAirStream, self).__init__(datapath=datapath, **kwargs)
def _build_dataset_index(self):
""" build list of images, poses, depths, and intrinsics """
        images, depths, poses, intrinsics = loadtum(self.datapath)
        intrinsic = TartanAirStream.calib_read(self.datapath)
        intrinsics = np.tile(intrinsic[None], (len(images), 1))
self.images = images
self.poses = poses
self.depths = depths
self.intrinsics = intrinsics
@staticmethod
def calib_read(datapath):
return np.array([320.0, 320.0, 320.0, 240.0])
@staticmethod
def image_read(image_file):
return cv2.imread(image_file)
@staticmethod
def depth_read(depth_file):
depth = cv2.imread(depth_file, cv2.IMREAD_ANYDEPTH)
return depth.astype(np.float32) / 5000.0
import numpy as np
import torch
import csv
import os
import cv2
import math
import random
import json
import pickle
import os.path as osp
from lietorch import SE3
from .stream import RGBDStream
from .rgbd_utils import loadtum
intrinsics_dict = {
'freiburg1': [517.3, 516.5, 318.6, 255.3],
'freiburg2': [520.9, 521.0, 325.1, 249.7],
'freiburg3': [535.4, 539.2, 320.1, 247.6],
}
distortion_dict = {
'freiburg1': [0.2624, -0.9531, -0.0054, 0.0026, 1.1633],
'freiburg2': [0.2312, -0.7849, -0.0033, -0.0001, 0.9172],
'freiburg3': [0, 0, 0, 0, 0],
}
def as_intrinsics_matrix(intrinsics):
K = np.eye(3)
K[0,0] = intrinsics[0]
K[1,1] = intrinsics[1]
K[0,2] = intrinsics[2]
K[1,2] = intrinsics[3]
return K
class TUMStream(RGBDStream):
def __init__(self, datapath, **kwargs):
super(TUMStream, self).__init__(datapath=datapath, **kwargs)
def _build_dataset_index(self):
""" build list of images, poses, depths, and intrinsics """
images, depths, poses, intrinsics = loadtum(self.datapath, self.frame_rate)
intrinsic, _ = TUMStream.calib_read(self.datapath)
intrinsics = np.tile(intrinsic[None], (len(images), 1))
# set first pose to identity
poses = SE3(torch.as_tensor(poses))
poses = poses[[0]].inv() * poses
poses = poses.data.cpu().numpy()
self.images = images
self.poses = poses
self.depths = depths
self.intrinsics = intrinsics
@staticmethod
def calib_read(datapath):
if 'freiburg1' in datapath:
intrinsic = intrinsics_dict['freiburg1']
d_coef = distortion_dict['freiburg1']
elif 'freiburg2' in datapath:
intrinsic = intrinsics_dict['freiburg2']
d_coef = distortion_dict['freiburg2']
elif 'freiburg3' in datapath:
intrinsic = intrinsics_dict['freiburg3']
d_coef = distortion_dict['freiburg3']
return np.array(intrinsic), np.array(d_coef)
@staticmethod
def image_read(image_file):
intrinsics, d_coef = TUMStream.calib_read(image_file)
K = as_intrinsics_matrix(intrinsics)
image = cv2.imread(image_file)
return cv2.undistort(image, K, d_coef)
@staticmethod
def depth_read(depth_file):
depth = cv2.imread(depth_file, cv2.IMREAD_ANYDEPTH)
return depth.astype(np.float32) / 5000.0
import lietorch
import torch
import torch.nn.functional as F
from .chol import block_solve
import geom.projective_ops as pops
# utility functions for scattering ops
def safe_scatter_add_mat(H, data, ii, jj, B, M, D):
v = (ii >= 0) & (jj >= 0)
H.scatter_add_(1, (ii[v]*M + jj[v]).view(1,-1,1,1).repeat(B,1,D,D), data[:,v])
def safe_scatter_add_vec(b, data, ii, B, M, D):
v = ii >= 0
b.scatter_add_(1, ii[v].view(1,-1,1).repeat(B,1,D), data[:,v])
def MoBA(target, weight, poses, disps, intrinsics, ii, jj, fixedp=1, lm=0.0001, ep=0.1):
""" MoBA: Motion Only Bundle Adjustment """
B, M = poses.shape[:2]
D = poses.manifold_dim
N = ii.shape[0]
    ### 1: compute jacobians and residuals ###
coords, valid, (Ji, Jj) = pops.projective_transform(
poses, disps, intrinsics, ii, jj, jacobian=True)
r = (target - coords).view(B, N, -1, 1)
w = (valid * weight).view(B, N, -1, 1)
### 2: construct linear system ###
Ji = Ji.view(B, N, -1, D)
Jj = Jj.view(B, N, -1, D)
wJiT = (.001 * w * Ji).transpose(2,3)
wJjT = (.001 * w * Jj).transpose(2,3)
Hii = torch.matmul(wJiT, Ji)
Hij = torch.matmul(wJiT, Jj)
Hji = torch.matmul(wJjT, Ji)
Hjj = torch.matmul(wJjT, Jj)
vi = torch.matmul(wJiT, r).squeeze(-1)
vj = torch.matmul(wJjT, r).squeeze(-1)
# only optimize keyframe poses
M = M - fixedp
ii = ii - fixedp
jj = jj - fixedp
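    # observations involving fixed poses now have negative indices and are dropped by safe_scatter_add_*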
H = torch.zeros(B, M*M, D, D, device=target.device)
safe_scatter_add_mat(H, Hii, ii, ii, B, M, D)
safe_scatter_add_mat(H, Hij, ii, jj, B, M, D)
safe_scatter_add_mat(H, Hji, jj, ii, B, M, D)
safe_scatter_add_mat(H, Hjj, jj, jj, B, M, D)
H = H.reshape(B, M, M, D, D)
v = torch.zeros(B, M, D, device=target.device)
safe_scatter_add_vec(v, vi, ii, B, M, D)
safe_scatter_add_vec(v, vj, jj, B, M, D)
### 3: solve the system + apply retraction ###
dx = block_solve(H, v, ep=ep, lm=lm)
poses1, poses2 = poses[:,:fixedp], poses[:,fixedp:]
poses2 = poses2.retr(dx)
poses = lietorch.cat([poses1, poses2], dim=1)
return poses
def SLessBA(target, weight, poses, disps, intrinsics, ii, jj, fixedp=1):
""" Structureless Bundle Adjustment """
pass
def BA(target, weight, poses, disps, intrinsics, ii, jj, fixedp=1):
""" Full Bundle Adjustment """
pass
import torch
import torch.nn.functional as F
import geom.projective_ops as pops
class CholeskySolver(torch.autograd.Function):
@staticmethod
def forward(ctx, H, b):
# don't crash training if cholesky decomp fails
try:
U = torch.cholesky(H)
xs = torch.cholesky_solve(b, U)
ctx.save_for_backward(U, xs)
ctx.failed = False
except Exception as e:
print(e)
ctx.failed = True
xs = torch.zeros_like(b)
return xs
@staticmethod
def backward(ctx, grad_x):
if ctx.failed:
return None, None
U, xs = ctx.saved_tensors
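        # x = H^-1 b, so grad_b = H^-1 grad_x and grad_H = -x grad_b^T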
dz = torch.cholesky_solve(grad_x, U)
dH = -torch.matmul(xs, dz.transpose(-1,-2))
return dH, dz
def block_solve(H, b, ep=0.1, lm=0.0001):
""" solve normal equations """
B, N, _, D, _ = H.shape
I = torch.eye(D).to(H.device)
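    # damp the diagonal of each DxD block: absolute (ep) plus relative (lm) damping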
H = H + (ep + lm*H) * I
H = H.permute(0,1,3,2,4)
H = H.reshape(B, N*D, N*D)
b = b.reshape(B, N*D, 1)
x = CholeskySolver.apply(H,b)
return x.reshape(B, N, D)