# Sparse Structure Visualization Guide

This guide explains how to use the sparse structure visualization features added to TRELLIS.2.

## Overview

The sparse structure is a 3D voxel grid that represents which parts of the 3D space are occupied by the object being generated. Visualizing this helps you understand:

- The initial "skeleton" or blueprint of the 3D object
- How different pipeline types (`512`, `1024`, `1024_cascade`, `1536_cascade`) affect the sparse structure
- The distribution and density of occupied voxels
- The upsampling process in cascade modes (from LR to HR coordinates)
- Potential issues in the generation process

## Five Stages of Visualization

### Stage 1: Initial Sparse Structure
Generated by [`sample_sparse_structure()`](trellis2/pipelines/trellis2_image_to_3d.py:189) - this is the initial coarse voxel grid.

### Stage 2: High-Resolution Coordinates (Cascade Modes Only)
Generated by [`sample_shape_slat_cascade()`](trellis2/pipelines/trellis2_image_to_3d.py:280) - these are the upsampled coordinates after the decoder upsamples the sparse latent 4x.

**Note:** HR coordinates visualization is only available for cascade pipeline types (`1024_cascade` and `1536_cascade`).

### Stage 3: Quantized Coordinates (Cascade Modes Only)
Generated after the resolution adjustment loop in [`sample_shape_slat_cascade()`](trellis2/pipelines/trellis2_image_to_3d.py:412) - these are the coordinates after quantization, deduplication, and adaptive resolution adjustment.

**What this shows:**
- The final coordinate grid used for shape generation
- How many tokens remain after adaptive resolution reduction
- The actual spatial resolution being used (may be less than target)
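
The quantize, deduplicate, and adjust steps can be sketched roughly as follows. This is a minimal NumPy illustration on random synthetic coordinates, not the code in `sample_shape_slat_cascade()`. The helper names (`quantize_and_dedupe`, `fit_token_budget`), the token budget, and the step size are all hypothetical.

```python
import numpy as np

def quantize_and_dedupe(hr_coords, src_res, dst_res):
    """Rescale (batch, x, y, z) rows from a src_res grid to a dst_res grid and drop duplicates."""
    out = hr_coords.copy()
    # Map cell centres onto the target grid, then floor to integer indices.
    scaled = np.floor((hr_coords[:, 1:] + 0.5) / src_res * dst_res)
    out[:, 1:] = scaled.astype(hr_coords.dtype)
    return np.unique(out, axis=0)

def fit_token_budget(hr_coords, src_res, start_res, max_tokens, step=8, min_res=8):
    """Lower the target resolution until the deduplicated token count fits the budget."""
    res = start_res
    while True:
        coords = quantize_and_dedupe(hr_coords, src_res, res)
        # Stop when the budget is met or the resolution cannot be lowered further.
        if len(coords) <= max_tokens or res - step < min_res:
            return coords, res
        res -= step

# Synthetic stand-in for HR coordinates on a 128^3 grid (values are made up).
rng = np.random.default_rng(0)
xyz = rng.integers(0, 128, size=(5000, 3))
hr = np.concatenate([np.zeros((5000, 1), dtype=xyz.dtype), xyz], axis=1)

coords, res = fit_token_budget(hr, src_res=128, start_res=64, max_tokens=3000)
print(f"resolution used: {res}, tokens: {len(coords)}")
```

The returned resolution can be lower than the starting one, which is why the Stage 3 plots may show a coarser grid than the pipeline type suggests.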

### Stage 4: Final SLat Features (Cascade Modes Only)
Generated after flow model sampling and denormalization in [`sample_shape_slat_cascade()`](trellis2/pipelines/trellis2_image_to_3d.py:450) - these are the learned features at each coordinate.

**What this shows:**
- The actual learned shape features
- Feature value distributions across the object
- Quality of the generated shape representation

**Note:** SLat features visualization is only available for cascade pipeline types.

### Stage 5: Texture Features (Cascade Modes Only)
Generated during texture sampling in [`sample_tex_slat()`](trellis2/pipelines/trellis2_image_to_3d.py:567) - these are the learned texture attributes at each coordinate.

**What this shows:**
- Learned texture features (e.g., RGB colors, roughness, metallic properties)
- How texture varies across spatial locations
- Feature value distributions for each texture channel

**Note:** Texture features typically have multiple dimensions (e.g., 3 for RGB textures).

## Understanding the Visualizations

### What You're Seeing

The sparse structure coordinates have shape `[N, 4]` where:
- **Column 0**: Batch index (always 0 for single samples)
- **Column 1**: X coordinate (0 to resolution-1)
- **Column 2**: Y coordinate (0 to resolution-1)
- **Column 3**: Z coordinate (0 to resolution-1)
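
As a minimal sketch of this layout, the snippet below builds a small synthetic `[N, 4]` array with NumPy and splits it into its columns. The array stands in for the tensor the pipeline returns; the values are made up for illustration.

```python
import numpy as np

resolution = 32  # sparse structure grid size for the 512 and cascade pipelines

# Synthetic stand-in for pipeline output: rows are (batch, x, y, z).
coords = np.array(
    [
        [0, 2, 5, 3],
        [0, 15, 16, 14],
        [0, 29, 26, 28],
    ],
    dtype=np.int32,
)

batch_idx = coords[:, 0]  # column 0: batch index
xyz = coords[:, 1:]       # columns 1-3: X, Y, Z voxel indices

# Every spatial index must lie in [0, resolution - 1].
in_bounds = bool(((xyz >= 0) & (xyz < resolution)).all())
print("batch indices:", np.unique(batch_idx).tolist())
print("all coordinates in bounds:", in_bounds)
```

For a real run, convert the tensor first with `coords.cpu().numpy()`.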

### Initial Sparse Structure vs. HR Coordinates

When using cascade modes, you'll see two sets of visualizations:

1. **Initial Sparse Structure** (e.g., `sparse_structure_1024_cascade_seed42_*.png`)
   - Coarse 32³ voxel grid
   - ~5,000 - 15,000 occupied voxels
   - Generated directly from the sparse structure flow model

2. **HR Coordinates** (e.g., `hr_coords_1024_upsampled_*.png`)
   - Upsampled coordinates (4x denser)
   - ~20,000 - 60,000 coordinates
   - Generated by the decoder upsampling the shape SLat
   - Shows the refined structure before final shape generation

**Key Insight:** Comparing these two visualizations shows how the upsampling process refines the initial sparse structure into a more detailed representation.
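
One rough way to make this comparison numerically is to map each HR coordinate back onto the coarse grid and see which coarse voxels it belongs to. The sketch below assumes the HR grid is exactly 4x the coarse grid along each axis and that integer division recovers the parent voxel. Both are assumptions for illustration, and the arrays are synthetic.

```python
import numpy as np

scale = 4  # assumed ratio between the HR grid and the coarse grid

# Synthetic coarse voxels, rows are (batch, x, y, z).
lr_coords = np.array([[0, 1, 1, 1], [0, 1, 2, 1], [0, 2, 2, 1]])

# Synthetic HR coordinates: a few fine cells inside each coarse voxel.
hr_coords = np.array(
    [
        [0, 4, 4, 4], [0, 5, 4, 6], [0, 7, 7, 7],  # inside coarse voxel (1, 1, 1)
        [0, 4, 8, 4], [0, 6, 11, 5],               # inside coarse voxel (1, 2, 1)
        [0, 8, 9, 4],                              # inside coarse voxel (2, 2, 1)
    ]
)

# Map HR cells back to their parent coarse voxel (batch column is left untouched).
parents = hr_coords.copy()
parents[:, 1:] //= scale

lr_set = {tuple(row) for row in lr_coords.tolist()}
parent_set = {tuple(row) for row in parents.tolist()}

covered = lr_set & parent_set  # coarse voxels that kept at least one HR cell
stray = parent_set - lr_set    # parents of HR cells that are not in the coarse set
print(f"coarse voxels covered: {len(covered)}/{len(lr_set)}")
print(f"HR cells per coarse voxel: {len(hr_coords) / len(lr_set):.1f}")
print(f"stray parents: {len(stray)}")
```

A large number of uncovered coarse voxels or stray parents would suggest the two stages disagree about where the object is.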

## Available Visualization Methods

### 1. Matplotlib 3D Scatter Plot (`visualize_sparse_structure_matplotlib`)

Shows the sparse structure as a 3D scatter plot with color-coded Z coordinates.

**Best for:** Understanding the overall 3D shape and spatial distribution.

**Output:** Interactive 3D plot (or saved PNG file)

### 2. Voxel Grid Visualization (`visualize_sparse_structure_voxel`)

Displays the sparse structure as a 3D voxel grid where each occupied voxel is shown as a point.

**Best for:** Seeing the actual voxel structure and understanding resolution effects.

**Output:** 3D voxel visualization (or saved PNG file)

### 3. 2D Projections (`visualize_sparse_structure_projections`)

Shows three orthogonal 2D projections:
- **XY Projection**: Top view (looking down Z axis)
- **XZ Projection**: Side view (looking down Y axis)
- **YZ Projection**: Front view (looking down X axis)

**Best for:** Quick analysis of shape from different angles.

**Output:** Three 2D scatter plots in one figure (or saved PNG file)
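
The idea behind the three projections can be sketched without any plotting: drop one axis and mark every pixel that at least one voxel lands on. The helper `project_occupancy` below is a hypothetical illustration, not the implementation of `visualize_sparse_structure_projections`.

```python
import numpy as np

def project_occupancy(coords, resolution):
    """Return boolean XY, XZ and YZ occupancy images for (batch, x, y, z) rows."""
    x, y, z = coords[:, 1], coords[:, 2], coords[:, 3]
    views = {}
    for name, (a, b) in {"XY": (x, y), "XZ": (x, z), "YZ": (y, z)}.items():
        img = np.zeros((resolution, resolution), dtype=bool)
        img[a, b] = True  # a pixel is set if any voxel projects onto it
        views[name] = img
    return views

# Synthetic example: a vertical column of 4 voxels at x=3, y=5.
coords = np.array([[0, 3, 5, z] for z in range(4)])
views = project_occupancy(coords, resolution=8)
for name, img in views.items():
    print(name, "occupied pixels:", int(img.sum()))
```

The column collapses to a single pixel in the XY view but remains a 4-pixel line in the XZ and YZ views, which is the kind of difference the projections are meant to expose.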

### 4. Multi-View Visualization (`visualize_sparse_structure_multi_view`)

Combines 3D scatter plot with 2D projections in a single figure.

**Best for:** Comprehensive overview of sparse structure.

**Output:** Combined 3D + 2D visualization (or saved PNG file)

### 5. Statistical Analysis (`analyze_sparse_structure`)

Prints numerical statistics about the sparse structure:
- Total number of occupied voxels
- Coordinate ranges (X, Y, Z)
- Center position
- Standard deviation
- Bounding box volume

**Best for:** Quick quantitative analysis without visualization.

**Output:** Console output with statistics
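
For reference, the same kinds of statistics can be computed directly with NumPy. The function below is a hypothetical re-implementation for illustration, and `analyze_sparse_structure` may define some quantities differently. In particular, the bounding box volume here assumes inclusive extents (`max - min + 1` per axis).

```python
import numpy as np

def summarize_coords(coords):
    """Basic statistics for (batch, x, y, z) rows."""
    xyz = coords[:, 1:].astype(np.float64)
    lo, hi = xyz.min(axis=0), xyz.max(axis=0)
    extent = hi - lo + 1  # inclusive extent in voxels (assumption)
    return {
        "num_voxels": int(len(coords)),
        "min": lo.astype(int).tolist(),
        "max": hi.astype(int).tolist(),
        "center": xyz.mean(axis=0).round(2).tolist(),
        "std": xyz.std(axis=0).round(2).tolist(),
        "bbox_volume": int(np.prod(extent)),
    }

# Synthetic 2 x 3 x 4 solid block with its corner at (10, 10, 10).
axes = (np.arange(10, 12), np.arange(10, 13), np.arange(10, 14))
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)
coords = np.concatenate([np.zeros((len(grid), 1), dtype=grid.dtype), grid], axis=1)

stats = summarize_coords(coords)
for key, value in stats.items():
    print(f"{key}: {value}")
```

With inclusive extents, a fully solid block has a voxel count equal to its bounding box volume, so the ratio of the two gives a quick fill estimate.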

### 6. SLat Features Visualization (`visualize_slat_features`)

Shows learned features in the shape Structured Latent (SLat) as a 3D scatter plot.

**Best for:** Understanding what the model has learned at each spatial location.

**Output:** 3D plot colored by feature values (or saved PNG file)

**Parameters:**
- `feature_idx`: Which feature dimension to visualize (default: 0)
- Multiple features can be visualized by calling with different indices

**Note:** Texture features have multiple dimensions (e.g., 3 for RGB), each representing learned texture attributes at each coordinate.

### 7. SLat Features Analysis (`analyze_slat_features`)

Prints numerical statistics about SLat features:
- Number of tokens (coordinates)
- Feature dimensions
- Statistics for each feature (min, max, mean, std)
- NaN/Inf value checks
- Coordinate ranges

**Best for:** Debugging feature values and checking for anomalies.

**Output:** Console output with feature statistics
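
A comparable check can be sketched on a plain `[N, C]` feature array. The helper `summarize_features` is hypothetical and only illustrates the kind of per-channel statistics and NaN/Inf checks listed above. It skips non-finite values when computing statistics, which the real method may not do.

```python
import numpy as np

def summarize_features(feats):
    """Per-channel statistics for an [N, C] array, computed over finite values only."""
    finite = np.isfinite(feats)
    report = {
        "num_tokens": feats.shape[0],
        "num_features": feats.shape[1],
        "num_nan": int(np.isnan(feats).sum()),
        "num_inf": int(np.isinf(feats).sum()),
        "channels": [],
    }
    for c in range(feats.shape[1]):
        col = feats[finite[:, c], c]  # assumes each channel has at least one finite value
        report["channels"].append(
            {
                "min": float(col.min()),
                "max": float(col.max()),
                "mean": float(col.mean()),
                "std": float(col.std()),
            }
        )
    return report

# Synthetic features with one NaN and one Inf planted on purpose.
feats = np.array([[0.0, 1.0], [2.0, np.nan], [4.0, 3.0], [np.inf, 5.0]])
report = summarize_features(feats)
print("NaN:", report["num_nan"], "Inf:", report["num_inf"])
for c, ch in enumerate(report["channels"]):
    print(f"feature {c}: min={ch['min']}, max={ch['max']}, mean={ch['mean']}")
```

Any non-zero NaN or Inf count is worth investigating before looking at the plots, since a colormap will silently hide those points.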

## Quick Start with example_visualization.py

The repo includes [example_visualization.py](example_visualization.py), a standalone script that runs the full pipeline, saves stage visualizations, renders multiple views, and exports a raw `.obj` file — all in one shot. It is the fastest way to verify the pipeline is working correctly on your hardware.

### What it does

1. Runs the pipeline on a test image with all visualization stages enabled
2. Exports a raw `.obj` file (no renderer, no nvdiffrast) — load this in Blender to verify geometry completeness independently of the renderer
3. Renders N views using the same `render_snapshot` path as `app.py` and saves contact sheets

### Configuration

Edit the constants at the top of the file:

```python
IMAGE_PATH   = "assets/example_image/T2.png"   # input image
PIPELINE     = "1024_cascade"                   # '512' | '1024' | '1024_cascade' | '1536_cascade'
SEED         = 42
NVIEWS       = 8                                # render views
RENDER_RES   = 1024                             # render resolution
VIZ_DIR      = "visualizations_render_test"     # output directory
```

### Running

```sh
export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"   # AMD only
python example_visualization.py
```

Output files in `VIZ_DIR/`:
- `raw_mesh.obj` — raw geometry, no renderer involved. If this looks correct in Blender but renders look wrong, the bug is in the rasterizer path, not the pipeline.
- `render_frames/contact_shaded_all_views.png` — all rendered views side by side
- `render_frames/contact_normal_all_views.png` — surface normals
- `render_frames/contact_base_color_all_views.png` — albedo without lighting
- Per-stage visualization PNGs (sparse structure, HR coords, SLat features, etc.)

### Diagnosing issues

The `.obj` export is intentionally renderer-free. If the `.obj` geometry is complete but render images show only 15–30% coverage, the issue is in the nvdiffrast/rasterizer path. If the `.obj` itself looks wrong, the issue is earlier in the pipeline.
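
Before opening Blender, a quick text-level check of the exported file can rule out an empty or truncated mesh. The sketch below counts vertex and face records in a Wavefront OBJ file using only the standard library. The helper name `count_obj_elements` is hypothetical, and the demo writes its own one-triangle file rather than reading `raw_mesh.obj`.

```python
import os
import tempfile

def count_obj_elements(path):
    """Count 'v' (vertex) and 'f' (face) records in a Wavefront OBJ file."""
    counts = {"vertices": 0, "faces": 0}
    with open(path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.startswith("v "):    # 'vn' and 'vt' lines are not matched
                counts["vertices"] += 1
            elif line.startswith("f "):
                counts["faces"] += 1
    return counts

# Self-contained demo on a tiny hand-written OBJ containing one triangle.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.obj")
with open(demo_path, "w", encoding="utf-8") as fh:
    fh.write("v 0 0 0\nv 1 0 0\nv 0 1 0\nvn 0 0 1\nf 1 2 3\n")

counts = count_obj_elements(demo_path)
print(counts)
```

To check a real export, pass the path to `raw_mesh.obj` inside `VIZ_DIR`. Zero faces or a very small vertex count points to a problem upstream of the renderer.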


## Usage

### Basic Usage in Your Code

```python
from trellis2.pipelines import Trellis2ImageTo3DPipeline
from PIL import Image

# Load pipeline
pipeline = Trellis2ImageTo3DPipeline.from_pretrained("microsoft/TRELLIS.2-4B")
pipeline.cuda()

# Load image
image = Image.open("path/to/image.png")

# Run with visualization
mesh = pipeline.run(
    image,
    seed=42,
    pipeline_type='1024_cascade',
    visualize_sparse_structure=True,  # Enable visualization
    visualize_save_dir=None,          # None = interactive display
)
```

### Saving Visualizations to Disk

```python
mesh = pipeline.run(
    image,
    seed=42,
    pipeline_type='1024_cascade',
    visualize_sparse_structure=True,
    visualize_save_dir='my_visualizations',  # Save to directory
)
```

This will create multiple files for each visualization stage.

### Statistical Analysis Only

```python
# Generate sparse structure
coords = pipeline.sample_sparse_structure(
    pipeline.get_cond([image], 512),
    resolution=32,
    num_samples=1,
    sampler_params={}
)

# Analyze without visualization
pipeline.analyze_sparse_structure(coords)
```

Example output (illustrative values; actual numbers depend on the input image and seed):
```
Sparse Structure Analysis:
  Total occupied voxels: 15234
  X range: [2, 29]
  Y range: [5, 26]
  Z range: [3, 28]
  Center: [15.2, 15.8, 14.9]
  Std dev: [6.3, 5.9, 7.1]
  Bounding box volume: 16016
```

## Parameters

### `visualize_sparse_structure` (bool)
- **Default:** `False`
- **Description:** Enable or disable sparse structure visualization
- **Usage:** Set to `True` to visualize the sparse structure after generation

### `visualize_save_dir` (str or None)
- **Default:** `None`
- **Description:** Directory path to save visualization images
- **Usage:** 
  - `None`: Display visualizations interactively (blocks execution)
  - `"/path/to/dir"`: Save visualizations to disk (non-blocking)

## Understanding the Visualizations

### Resolution Differences

Different pipeline types use different sparse structure resolutions:

| Pipeline Type | Sparse Structure Resolution | Grid Size | Typical Voxel Count |
|--------------|----------------------------|-----------|---------------------|
| 512 | 32 | 32³ = 32,768 | ~5,000 - 15,000 |
| 1024 | 64 | 64³ = 262,144 | ~20,000 - 50,000 |
| 1024_cascade | 32 | 32³ = 32,768 | ~5,000 - 15,000 |
| 1536_cascade | 32 | 32³ = 32,768 | ~5,000 - 15,000 |

**Note:** Cascade modes use the same sparse structure resolution as 512, but later upsample during shape generation.
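
The grid sizes in the table are the cube of the sparse structure resolution, and dividing the typical voxel counts by the grid size gives a rough occupancy fraction. The short sketch below reproduces that arithmetic, with the resolutions and voxel ranges copied from the table.

```python
# Sparse structure resolution per pipeline type, copied from the table above.
ss_resolution = {"512": 32, "1024": 64, "1024_cascade": 32, "1536_cascade": 32}

# Typical occupied-voxel ranges, also from the table (rough guidance, not measurements).
typical_voxels = {
    "512": (5_000, 15_000),
    "1024": (20_000, 50_000),
    "1024_cascade": (5_000, 15_000),
    "1536_cascade": (5_000, 15_000),
}

grid_cells = {name: res ** 3 for name, res in ss_resolution.items()}

for name, cells in grid_cells.items():
    lo, hi = typical_voxels[name]
    print(f"{name:>12}: {cells:>7,} cells, {lo / cells:.1%} to {hi / cells:.1%} occupied")
```

A voxel count far outside these fractions for a given pipeline type is a reasonable first signal that the sparse structure is unusually empty or unusually dense.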

### Color Coding

- **Z-coordinate coloring**: Points are colored by their Z position (using viridis colormap)
- **Higher Z values**: Yellow/green (top of object)
- **Lower Z values**: Purple/blue (bottom of object)
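
To see which colors correspond to which heights, you can evaluate the viridis colormap directly. The sketch below assumes Z is normalized linearly over `[0, resolution - 1]`. That normalization is an assumption, since the visualization methods may instead scale to the data range.

```python
import numpy as np
from matplotlib import cm

resolution = 32
z = np.array([0, 16, 31])  # bottom, middle, and top of a 32^3 grid

# Normalize Z to [0, 1] before looking up colors.
t = z / (resolution - 1)
rgba = cm.viridis(t)  # array of shape (3, 4): one RGBA row per Z value

for height, color in zip(z.tolist(), rgba):
    r, g, b, _ = color
    print(f"z={height:>2}: r={r:.2f} g={g:.2f} b={b:.2f}")
```

The lowest Z value comes out dark purple (blue and red dominate) and the highest comes out yellow (red and green dominate), matching the description above.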

## Examples

### Example 1: Compare Different Pipeline Types

```python
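import os

for pipeline_type in ['512', '1024_cascade', '1536_cascade']:
    print(f"Generating with {pipeline_type}...")
    # One output directory per pipeline type, created up front in case the pipeline does not create it
    save_dir = os.path.join('comparison', pipeline_type)
    os.makedirs(save_dir, exist_ok=True)
    mesh = pipeline.run(
        image,
        seed=42,
        pipeline_type=pipeline_type,
        visualize_sparse_structure=True,
        visualize_save_dir=save_dir,
    )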
```

### Example 2: Debug Generation Issues

```python
# Generate with visualization to check sparse structure
mesh = pipeline.run(
    image,
    seed=42,
    pipeline_type='1024_cascade',
    visualize_sparse_structure=True,
    visualize_save_dir='debug_output',
)

# If sparse structure looks abnormal, you can:
# 1. Check if voxel count is too high/low
# 2. Verify coordinate ranges are within expected bounds
# 3. Compare with known good examples
```

### Example 3: Batch Analysis

```python
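import pandas as pd
import torch

# The conditioning depends only on the image, so compute it once
cond = pipeline.get_cond([image], 512)

results = []

for seed in range(10):
    # Seed torch's global RNG so each iteration draws a different, reproducible sample
    # (assumes the sampler takes its noise from the global RNG)
    torch.manual_seed(seed)
    coords = pipeline.sample_sparse_structure(
        cond,
        resolution=32,
        num_samples=1,
        sampler_params={}
    )

    coords_np = coords.cpu().numpy()
    results.append({
        'seed': seed,
        'num_voxels': len(coords),
        'x_range': coords_np[:, 1].max() - coords_np[:, 1].min(),
        # ... more fields
    })

df = pd.DataFrame(results)
print(df.describe())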

### Example 4: Complete Cascade Visualization

```python
# Visualize complete cascade process with all stages
mesh = pipeline.run(
    image,
    seed=42,
    pipeline_type='1024_cascade',
    visualize_sparse_structure=True,
    visualize_save_dir='complete_cascade',
)

# This creates visualizations for:
# 1. Initial sparse structure
# 2. HR coordinates (upsampled)
# 3. Quantized coordinates
# 4. Final SLat features
# 5. Texture features
```

## Troubleshooting

### Issue: Plots don't display

**Solution:** Make sure you're running in an environment with display support (not headless). For headless environments, use `visualize_save_dir` to save files instead.
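
If you call Matplotlib yourself on a headless machine, selecting the non-interactive `Agg` backend before importing `pyplot` avoids display errors. The sketch below plots random synthetic points and saves them to a temporary directory. It does not use the pipeline, and the output file name is arbitrary.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; must be selected before importing pyplot
import matplotlib.pyplot as plt
import numpy as np

# Synthetic voxel coordinates on a 32^3 grid (values are made up).
rng = np.random.default_rng(0)
xyz = rng.integers(0, 32, size=(500, 3))

fig = plt.figure(figsize=(5, 5))
ax = fig.add_subplot(projection="3d")
ax.scatter(xyz[:, 0], xyz[:, 1], xyz[:, 2], c=xyz[:, 2], cmap="viridis", s=4)
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_zlabel("Z")

out_path = os.path.join(tempfile.mkdtemp(), "sparse_structure.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)  # release the figure's memory
print("saved:", out_path)
```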

### Issue: Too many voxels, visualization is slow

**Solution:** The visualization can be slow for very large sparse structures (>50,000 voxels). Consider:
1. Using lower resolution pipeline types
2. Saving to disk instead of interactive display
3. Using statistical analysis instead of full visualization
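
Another option, if you are calling the plotting code yourself, is to plot a random subset of the points. The helper `subsample_coords` below is a hypothetical sketch and not part of the pipeline. The overall shape usually remains recognizable with a few thousand points.

```python
import numpy as np

def subsample_coords(coords, max_points, seed=0):
    """Return at most max_points rows, chosen uniformly at random without replacement."""
    if len(coords) <= max_points:
        return coords
    rng = np.random.default_rng(seed)
    keep = rng.choice(len(coords), size=max_points, replace=False)
    return coords[np.sort(keep)]  # sorting keeps the original row order

# Synthetic example: 60,000 rows of (batch, x, y, z) on a 128^3 grid.
rng = np.random.default_rng(1)
xyz = rng.integers(0, 128, size=(60_000, 3))
coords = np.concatenate([np.zeros((len(xyz), 1), dtype=xyz.dtype), xyz], axis=1)

small = subsample_coords(coords, max_points=10_000)
print(coords.shape, "->", small.shape)
```

Use the full coordinate set for statistics and the subset only for plotting, so that voxel counts stay accurate.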

### Issue: Out of memory during visualization

**Solution:** Matplotlib can use significant memory for large plots. Try:
1. Saving to disk instead of interactive display
2. Using only 2D projections method
3. Using statistical analysis only

## Advanced Usage

### Custom Visualization

You can also call individual visualization methods directly:

```python
# Get sparse structure
coords = pipeline.sample_sparse_structure(
    pipeline.get_cond([image], 512),
    resolution=32,
    num_samples=1,
    sampler_params={}
)

# Use specific visualization method
pipeline.visualize_sparse_structure_projections(
    coords,
    resolution=32,
    title="My Custom Title",
    save_path="custom_output.png"
)
```

### Integration with Existing Code

```python
# In your existing pipeline code
import torch

def my_generation_function(image, seed):
    # Seed torch's global RNG for reproducibility (assumes the samplers draw from it)
    torch.manual_seed(seed)

    # Compute the image conditioning once and reuse it for both stages
    cond = pipeline.get_cond([image], 512)

    # Generate sparse structure
    coords = pipeline.sample_sparse_structure(
        cond,
        resolution=32,
        num_samples=1,
        sampler_params={}
    )

    # Analyze
    pipeline.analyze_sparse_structure(coords)

    # Continue with generation
    shape_slat = pipeline.sample_shape_slat(
        cond,
        pipeline.models['shape_slat_flow_model_512'],
        coords,
        {}
    )

    # ... rest of your code

## References

- Main pipeline code: `trellis2/pipelines/trellis2_image_to_3d.py`
- Example script: `example_visualization.py`
- Sparse structure sampling: `sample_sparse_structure()` method (lines 189-236)
- Visualization methods: Lines 472-690 in `trellis2_image_to_3d.py`