Preprint on UNITE - Unified Semantic Transformer for 3D Scene Understanding available on arXiv