Paper on semantic transformer for 3D scene understanding accepted at TMLR