We present an efficient implementation of volumetric anisotropic image diffusion filters on modern programmable graphics processing units (GPUs), in which the mathematics of volumetric diffusion is effectively reduced to that of diffusion in 2D images. We thereby avoid the computational bottleneck of a time-consuming eigenvalue decomposition in ℝ³. Instead, we project the Hessian matrix along the surface normal onto the tangent plane of the local isodensity surface and solve for the remaining two tangent-space eigenvectors. We derive closed-form expressions for this step, which keeps the GPU code free of branching. We show that our most complex volumetric anisotropic diffusion filters achieve a speedup factor of more than 600 compared to a CPU solution.
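
To make the tangent-plane idea concrete, the following is a minimal CPU-side sketch in plain Python, not the GPU implementation described here. It assumes a nonzero gradient whose direction serves as the surface normal, a symmetric 3×3 Hessian, and a copysign-based orthonormal tangent basis chosen purely for illustration. The helper names (`tangent_basis`, `tangent_eigen`) are hypothetical. The 2×2 eigenproblem is solved with the standard closed form for symmetric matrices, so no conditional branches appear in the sketch.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def mat_vec(H, v):
    # H is a 3x3 matrix given as a tuple of rows.
    return tuple(dot(row, v) for row in H)

def tangent_basis(n):
    # Assumption: copysign-based orthonormal basis for a unit normal n.
    # This is an illustrative choice, not necessarily the paper's construction.
    s = math.copysign(1.0, n[2])
    a = -1.0 / (s + n[2])
    b = n[0] * n[1] * a
    t1 = (1.0 + s * n[0] * n[0] * a, s * b, -s * n[0])
    t2 = (b, s + n[1] * n[1] * a, -n[1])
    return t1, t2

def tangent_eigen(grad, H):
    """Eigenpairs of the Hessian H restricted to the plane orthogonal to grad."""
    norm = math.sqrt(dot(grad, grad))  # assumed nonzero
    n = tuple(c / norm for c in grad)
    t1, t2 = tangent_basis(n)

    # Entries of the projected symmetric 2x2 matrix [[a, b], [b, c]].
    Ht1, Ht2 = mat_vec(H, t1), mat_vec(H, t2)
    a, b, c = dot(t1, Ht1), dot(t1, Ht2), dot(t2, Ht2)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)

    # Rotation angle of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)

    # Lift the 2D eigenvectors back into 3D via the tangent basis.
    e1 = tuple(cs * p + sn * q for p, q in zip(t1, t2))
    e2 = tuple(-sn * p + cs * q for p, q in zip(t1, t2))
    return (mean + radius, e1), (mean - radius, e2)
```

For example, with the gradient along the z-axis and `H = diag(2, 5, 9)`, the tangent plane is the xy-plane and the sketch returns the eigenvalues 5 and 2, with eigenvectors along the y- and x-axes (up to sign). The eigenvalue 9 belongs to the normal direction and is never computed.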