NVIDIA Meshtron Overview
What is Meshtron?
Meshtron is an autoregressive model for generating high-quality 3D meshes with artist-like topology. It provides a scalable, data-driven solution capable of generating intricate meshes with up to 64K faces at 1024-level coordinate resolution.
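To make the two figures above concrete, here is a minimal sketch of what "1024-level coordinate resolution" and a face budget imply for sequence length. It assumes coordinates are normalized to [-1, 1] and that each triangle face is serialized as 3 vertices of 3 coordinates each, with no special tokens. The function names are illustrative and not taken from any Meshtron code.

```python
def quantize(value, lo=-1.0, hi=1.0, levels=1024):
    """Map a float in [lo, hi] to an integer bin in [0, levels - 1]."""
    clamped = min(max(value, lo), hi)
    scaled = (clamped - lo) / (hi - lo)  # 0.0 .. 1.0
    return min(int(scaled * levels), levels - 1)


def dequantize(index, lo=-1.0, hi=1.0, levels=1024):
    """Return the center of bin `index` in the original coordinate range."""
    return lo + (index + 0.5) * (hi - lo) / levels


def token_count(num_faces, verts_per_face=3, coords_per_vert=3):
    """Coordinate tokens needed to serialize `num_faces` triangles (assumed layout)."""
    return num_faces * verts_per_face * coords_per_vert


if __name__ == "__main__":
    print(quantize(0.3), dequantize(quantize(0.3)))
    print(token_count(64 * 1024))  # reading "64K" as 65,536 faces
```

Under these assumptions a 64K-face mesh corresponds to well over half a million coordinate tokens, which is why sequence-length efficiency matters for this kind of model.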
Key Features
- Scalability: Over 10x higher face count than existing methods
- Resolution: 8x higher coordinate resolution compared to previous approaches
- Efficiency: 2.5x higher token throughput and 50% lower memory use
- Control Options: Conditioning on point cloud, face count, quad ratio, and creativity
Control Inputs
- Point Cloud: Determines output mesh shape
- Face Count: Controls mesh density
- Quad Ratio: Steers the output between quad-style and triangle-style tessellation
- Creativity: Adjusts generation of additional details
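The four control inputs can be pictured as a single conditioning record. The sketch below is purely hypothetical: `MeshControls`, its field names, and the 0 to 1 range for the quad ratio are assumptions made for illustration, not a published Meshtron interface. No range is assumed for creativity.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class MeshControls:
    """Hypothetical container for the four control inputs (not an NVIDIA API)."""

    point_cloud: List[Point]  # surface samples that determine the output shape
    face_count: int           # target mesh density
    quad_ratio: float         # assumed range: 0.0 (triangle-style) to 1.0 (quad-style)
    creativity: float         # amount of additional detail; range not assumed here

    def validate(self) -> None:
        """Raise ValueError if a field violates the assumptions stated above."""
        if not self.point_cloud:
            raise ValueError("point_cloud must contain at least one point")
        if any(len(p) != 3 for p in self.point_cloud):
            raise ValueError("every point needs x, y and z")
        if self.face_count <= 0:
            raise ValueError("face_count must be positive")
        if not 0.0 <= self.quad_ratio <= 1.0:
            raise ValueError("quad_ratio is assumed to lie in [0, 1]")


if __name__ == "__main__":
    controls = MeshControls([(0.0, 0.0, 0.0), (0.1, 0.2, 0.3)], 2000, 0.8, 0.1)
    controls.validate()
    print(controls.face_count, controls.quad_ratio)
```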
Technical Architecture
Meshtron uses an Hourglass Transformer architecture with three stages:
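- Coordinate level: Processes every token (one token per vertex coordinate)
- Vertex level: 3x reduction in sequence length (three coordinates per vertex)
- Face level: 9x reduction in sequence length (three vertices per triangle face)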
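The three stages can be illustrated by computing the sequence length at each level and shortening a toy sequence in two steps of 3. This is a simplified sketch: mean pooling is used here only as a stand-in for whatever shortening operator the model actually uses, and the shifting needed to keep an autoregressive model causal is left out. `shorten` and `stage_lengths` are illustrative names.

```python
def shorten(seq, factor):
    """Mean-pool consecutive groups of `factor` vectors into one vector each."""
    if len(seq) % factor != 0:
        raise ValueError("sequence length must be a multiple of the factor")
    pooled = []
    for start in range(0, len(seq), factor):
        group = seq[start:start + factor]
        dim = len(group[0])
        pooled.append([sum(vec[d] for vec in group) / factor for d in range(dim)])
    return pooled


def stage_lengths(num_faces):
    """Sequence length seen by each hourglass stage for a triangle mesh."""
    coordinate = num_faces * 9  # 3 vertices x 3 coordinates per face
    return {
        "coordinate": coordinate,
        "vertex": coordinate // 3,
        "face": coordinate // 9,
    }


if __name__ == "__main__":
    tokens = [[float(i)] for i in range(18)]  # two faces, 1-d toy embeddings
    vertex_level = shorten(tokens, 3)         # 18 -> 6
    face_level = shorten(vertex_level, 3)     # 6 -> 2
    print(len(tokens), len(vertex_level), len(face_level))
    print(stage_lengths(2))
```

Because the innermost stage runs on a sequence 9x shorter than the token sequence, most of the attention cost is paid on far fewer positions, which is consistent with the efficiency gains listed above.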
Key Innovations
- Sliding window attention with an 8192-face context window
- Cross-attention implementation for control inputs
- Efficient token sequence processing
- Bottom-to-top generation process
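As a rough illustration of the sliding-window idea, the sketch below builds a causal attention mask in which each position sees only the most recent `window_tokens` positions, itself included. Converting the 8192-face window to tokens as 8192 x 9 is an assumption about units, and the function names are illustrative.

```python
def attention_span(position, window_tokens):
    """Return (first, last) indices that `position` may attend to."""
    first = max(0, position - window_tokens + 1)
    return first, position


def sliding_window_mask(seq_len, window_tokens):
    """mask[q][k] is True when query q may attend to key k (causal, windowed)."""
    return [
        [q - window_tokens < k <= q for k in range(seq_len)]
        for q in range(seq_len)
    ]


if __name__ == "__main__":
    for row in sliding_window_mask(5, 2):
        print("".join("x" if allowed else "." for allowed in row))
    # Assumed unit conversion: a window of 8192 faces at 9 tokens per face.
    print(attention_span(100_000, 8192 * 9))
```

With a fixed window, the per-token attention cost stays bounded no matter how long the mesh sequence grows, so generation cost scales linearly with face count.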
Applications
- Standalone remesher for existing meshes
- Integration with text-to-3D systems
- Integration with image-to-3D models
- Generation of artist-grade meshes