
TRELLIS.2 image-to-3D now runs on Mac (Apple Silicon) - no NVIDIA GPU needed
I ported Microsoft's TRELLIS.2 to run on Apple Silicon via PyTorch MPS. The original depends on five CUDA-only compiled extensions (flex_gemm, flash_attn, o_voxel, cumesh, nvdiffrast) that have no Mac equivalent, so I wrote replacement backends from scratch: a pure-PyTorch sparse 3D convolution (replacing flex_gemm), Python mesh extraction using spatial hashing (replacing the CUDA hashmap ops in o_voxel), SDPA attention for the sparse transformers (replacing flash_attn), and GPU-accelerated trilinear interpolation (replacing cumesh and nvdiffrast).
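
To give a rough idea of what a pure-PyTorch sparse 3D convolution can look like, here is a minimal sketch. It is not the code from the port and not flex_gemm's API: the function name `sparse_conv3d`, the weight layout `(K, K, K, C_in, C_out)`, and the sorted-key neighbor lookup are all assumptions made for illustration. It computes a submanifold-style convolution (outputs only at occupied voxels) by linearizing voxel coordinates into integer keys and looking up each of the K^3 neighbor offsets with `torch.searchsorted`.

```python
import torch

def sparse_conv3d(coords, feats, weight, bias=None):
    """Illustrative sparse 3D convolution using only stock PyTorch ops.

    coords: (N, 3) int64, unique voxel coordinates
    feats:  (N, C_in) features, one row per voxel
    weight: (K, K, K, C_in, C_out), K odd (hypothetical layout)
    Returns (N, C_out), evaluated only at the input voxels.
    """
    n = coords.shape[0]
    k = weight.shape[0]
    r = k // 2

    # Shift coordinates so every neighbor offset stays non-negative,
    # then flatten (x, y, z) into a single collision-free integer key.
    c = coords - coords.min(dim=0).values + r
    size = c.max(dim=0).values + r + 1

    def key(x):
        return (x[:, 0] * size[1] + x[:, 1]) * size[2] + x[:, 2]

    sorted_keys, order = torch.sort(key(c))
    out = feats.new_zeros(n, weight.shape[-1])

    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            for dz in range(-r, r + 1):
                offset = torch.tensor([dx, dy, dz], device=coords.device)
                query = key(c + offset)
                # Binary search for the neighbor key among occupied voxels.
                pos = torch.searchsorted(sorted_keys, query).clamp(max=n - 1)
                hit = sorted_keys[pos] == query
                src = order[pos[hit]]                # neighbor rows that exist
                dst = hit.nonzero(as_tuple=True)[0]  # output rows they feed
                out.index_add_(0, dst, feats[src] @ weight[dx + r, dy + r, dz + r])

    if bias is not None:
        out = out + bias
    return out
```

With this convention the result should match a dense `torch.nn.functional.conv3d` (with padding `K // 2`) sampled at the occupied voxels. The tensors can be moved to `"mps"` when `torch.backends.mps.is_available()` returns true, though which ops are supported on MPS depends on the PyTorch version, so treat that as something to verify rather than a given.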