Evaluating Graphical Perception Capabilities of Vision Transformers
2025
Abstract
Vision Transformers (ViTs) have emerged as a powerful alternative to convolutional neural networks (CNNs) in a variety of image-based tasks. While CNNs have previously been evaluated for their ability to perform graphical perception tasks, which are essential for interpreting visualizations, the perceptual capabilities of ViTs remain largely unexplored. In this work, we investigate the performance of ViTs in elementary visual judgment tasks inspired by Cleveland and McGill’s foundational studies, which quantified the accuracy of human perception across different visual encodings. Building on their methodology, we benchmark ViTs against CNNs and human participants in a series of controlled graphical perception tasks. Our results reveal that, although ViTs demonstrate strong performance in general vision tasks, their alignment with human-like graphical perception in the visualization domain is limited. This study highlights key perceptual gaps and points to important considerations for the application of ViTs in visualization systems and graphical perception modeling.
Bibtex
@misc{poonam2025GPVIT,
  title   = {Evaluating Graphical Perception Capabilities of Vision Transformers},
  author  = {Poonam, Poonam and V{\'a}zquez, Pere-Pau and Ropinski, Timo},
  year    = {2025},
  journal = {arXiv preprint arXiv:}
}