The paper contributes to scanpath bundling methods. We propose an analytical approach for statistical comparisons of aggregated scanpath visualizations by means of second-order gaze analysis metrics. The present study explores differences in attention distribution and cognitive processing over architectural objects between architects, art historians, and non-experts. The results show between-group differences in attention dynamics of the aggregated scanpaths. The aggregated scanpaths of both expert groups were focal, while non-experts' scanpaths were ambient. Experts also paid more attention to architectural details and their location, and tended to remember them better. The discussion explores the scalability of the proposed approach for Human-Computer Interaction and accessibility technologies designed to enhance the experience of cultural heritage.

Honorable Mention for Full Paper

Classification of Alzheimer's Using Deep-Learning Methods on Webcam-Based Gaze Data

Anuj Harisinghani
Harshinee Sriram
Cristina Conati
Giuseppe Carenini
Thalia Field
Hyeju Jang
Gabriel Murray

There has been increasing interest in non-invasive predictors of Alzheimer's disease (AD) as an initial screen for this condition. Previously, successful attempts leveraged eye-tracking and language data generated during picture narration and reading tasks. These results were obtained with high-end, expensive eye-trackers. Instead, we explore classification using eye-tracking data collected with a webcam, where our classifiers are built using a deep-learning approach. Our results show that the webcam gaze classifier is not as good as the classifier based on high-end eye-tracking data. However, the webcam-based classifier still beats the majority-class baseline classifier in terms of AU-ROC, indicating that predictive signals can be extracted from webcam gaze tracking. Hence, although our results indicate that there is still a long way to go before webcam gaze tracking can reach practical relevance, they still provide an encouraging proof of concept that this technology should be further explored as an affordable alternative to high-end eye-trackers for the detection of AD.

Best Short Paper

Bridging the Gap: Gaze Events as Interpretable Concepts to Explain Deep Neural Sequence Models

Daniel Krakowczyk
Paul Prasse
David Reich
Sebastian Lapuschkin
Tobias Scheffer
Lena Jäger

Recent work in XAI for eye tracking data has evaluated the suitability of feature attribution methods to explain the output of deep neural sequence models for the task of oculomotoric biometric identification. These methods provide saliency maps to highlight important input features of a specific eye gaze sequence. However, to date, their localization analysis has been lacking a quantitative approach across entire datasets. In this work, we employ established gaze event detection algorithms for fixations and saccades and quantitatively evaluate the impact of these events by determining their concept influence. Input features that belong to saccades are shown to be substantially more important than features that belong to fixations. By dissecting saccade events into sub-events, we are able to show that gaze samples that are close to the saccadic peak velocity are most influential. We further investigate the effect of event properties like saccadic amplitude or fixational dispersion on the resulting concept influence.

Honorable Mention for Short Paper

The Salient360! Toolbox: Processing, Visualising and Comparing Gaze Data in 3D

Erwan David
Jesús Gutiérrez
Melissa Le-Hoa Vo
Antoine Coutrot
Matthieu Perreira da Silva
Patrick Le Callet

Eye tracking can serve as a gateway to studying the mind. For this reason it has been adopted by a diverse range of scientific communities. With the improvement of the quality of head-mounted virtual reality devices (HMDs) over the past 10 years, eye tracking has been added to capture gaze in immersive environments. The use of HMDs with eye tracking is increasing significantly and so is the need for a toolbox enabling consensus about eye tracking methods in 3D. We present the Salient360! toolbox: it implements functions to identify saccades and fixations and output gaze characteristics (e.g., fixation duration or saccade directions), to generate saliency maps, fixation maps, and scanpath data. It also implements routines for comparing gaze data that were adapted to 3D. We hope that this toolbox will spark discussions about the methodology of 3D gaze processing, facilitate running experiments, and improve the study of gaze in 3D. https://github.com/David-Ef/salient360Toolbox

Best Late Breaking Work

The Tiny Eye Movement Transformer

Wolfgang Fuhl
Anne Herrmann-Werner
Kay Nieselt

In this paper, we evaluate different small neural network models for eye movement classification and present the improved model architecture we have developed so far. For evaluation, we used a subset (1.5 million sequences) of the TEyeDS annotations since it contains in-the-wild recordings and, to our knowledge, has the most eye movement annotations. We classified fixations, saccades, and smooth pursuits with four different network architectures; the proposed model improves the equally weighted accuracy by 3.8% over the best competitor while using only 6% of the number of learnable weights.
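The abstract does not define "equally weighted accuracy". A minimal sketch, assuming it means the mean of per-class recall (often called balanced accuracy), is given below. The function name and the class labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def equally_weighted_accuracy(y_true, y_pred):
    """Mean of per-class recall: every class counts equally,
    regardless of how many samples it has (assumed definition)."""
    correct = defaultdict(int)  # correctly classified samples per true class
    total = defaultdict(int)    # all samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Average the per-class recalls over the classes present in y_true.
    return sum(correct[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # Hypothetical, imbalanced toy labels: fixations dominate the data.
    truth = ["fixation"] * 8 + ["saccade"] * 2
    always_fixation = ["fixation"] * 10
    print(equally_weighted_accuracy(truth, always_fixation))
```

Under this definition, a classifier that always predicts the dominant class is not rewarded for class imbalance, which matters for eye movement data where fixation samples typically outnumber saccade samples.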

Best Poster

Navigating Virtual Worlds: Examining Spatial Navigation Using a Graph Theoretical Analysis of Eye Tracking Data Recorded in Virtual Reality

Jasmin L. Walter
Vincent Schmidt
Sabine U. König
Peter König

In this work we apply a graph-theoretical analysis approach to eye tracking data recorded in virtual reality to investigate the underlying patterns of visual attention during spatial navigation. Based on the eye tracking data recorded in one virtual city, our graph-theoretical analysis identifies a subset of houses outstanding in their graph-theoretical properties, which we define as gaze-graph-defined landmarks. Moreover, we are able to replicate these results with a different eye tracking data set recorded in a different virtual city. Finally, the initial model selection process for the participants' performance in a point-to-building task in the second city suggests a stronger influence of graph-theoretical predictors on performance compared to the non-graph-related measures; however, more research will be necessary to determine their relationship.

COGAIN 2023 Best Paper

GazeCast: Using Mobile Devices to Allow Gaze-based Interaction on Public Displays

Omar Namnakani
Penpicha Sinrattanavong
Yasmeen Abdrabou
Andreas Bulling
Florian Alt
Mohamed Khamis

Gaze is promising for natural and spontaneous interaction with public displays, but current gaze-enabled displays require movement-hindering stationary eye trackers or cumbersome head-mounted eye trackers. We propose and evaluate GazeCast – a novel system that leverages users’ handheld mobile devices to allow gaze-based interaction with surrounding displays. In a user study (N = 20), we compared GazeCast to a standard webcam for gaze-based interaction using Pursuits. We found that while selection using GazeCast requires more time and physical demand, participants value GazeCast’s high accuracy and flexible positioning. We conclude by discussing how mobile computing can facilitate the adoption of gaze interaction with pervasive displays.

COGAIN Impact Award

A Fitts' Law Study of Click and Dwell Interaction by Gaze, Head and Mouse with a Head-Mounted Display

John Paulin Hansen
Vijay Rajanna
Scott MacKenzie
Per Bækgaard

Gaze and head tracking, or pointing, in head-mounted displays enables new input modalities for point-select tasks. We conducted a Fitts' law experiment with 41 subjects comparing head pointing and gaze pointing using a 300 ms dwell (n = 22) or click (n = 19) activation, with mouse input providing a baseline for both conditions. Gaze and head pointing were equally fast but slower than the mouse; dwell activation was faster than click activation. Throughput was highest for the mouse (2.75 bits/s), followed by head pointing (2.04 bits/s) and gaze pointing (1.85 bits/s). With dwell activation, however, throughput for gaze and head pointing was almost identical, as was the effective target width (≈ 55 pixels; about 2°) for all three input methods. Subjective feedback rated the physical workload lower for gaze pointing than for head pointing.
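For readers unfamiliar with the throughput and effective target width figures quoted above, the following is a minimal sketch of how Fitts' law throughput is commonly computed with the effective-width adjustment. It is not taken from the paper: the function name, argument names, and sample values are hypothetical, and the sketch covers a single sequence of trials, whereas studies typically average throughput over many sequences and participants.

```python
import math
from statistics import mean, stdev


def effective_throughput(amplitudes, endpoint_errors, movement_times):
    """Fitts' law throughput (bits/s) for one sequence of trials.

    amplitudes      -- actual movement distances along the task axis (pixels)
    endpoint_errors -- signed deviations of selections from the target
                       centre along the task axis (pixels)
    movement_times  -- time per trial (seconds)
    """
    # Effective target width: 4.133 x the standard deviation of endpoints.
    w_e = 4.133 * stdev(endpoint_errors)
    # Effective amplitude: mean of the distances actually travelled.
    a_e = mean(amplitudes)
    # Effective index of difficulty (Shannon formulation), in bits.
    id_e = math.log2(a_e / w_e + 1)
    # Throughput: bits of difficulty per second of movement time.
    return id_e / mean(movement_times)


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    tp = effective_throughput(
        amplitudes=[300.0, 310.0, 295.0, 305.0],
        endpoint_errors=[-12.0, 8.0, 3.0, -5.0],
        movement_times=[1.1, 0.9, 1.0, 1.2],
    )
    print(f"throughput: {tp:.2f} bits/s")
```

Because the effective width is derived from the observed spread of selections rather than the nominal target size, throughput computed this way combines speed and accuracy in one figure, which is what allows mouse, head, and gaze pointing to be compared on a single scale.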

Accepted Papers

Long Papers

A unified look on cultural heritage. Comparison of aggregated scanpaths between experts and non-experts in architecture.

Krzysztof Krejtz (SWPS University of Social Sciences and Humanities, Poland), Patryk Szczeciński (SWPS University, Poland), Aneta Pawłowska (University of Lodz, Poland), Daria Rutkowska-Siuda (University of Łódź, Poland), Katarzyna Wisiecka (SWPS University, Poland), Piotr Milczarski (University of Lodz, Poland), Artur Hłobaż (University of Lodz, Poland)

DynamicRead: Exploring Robust Gaze Interaction Methods for Reading on Handheld Mobile Devices under Dynamic Conditions

Yaxiong Lei (University of St Andrews, United Kingdom), Yuheng Wang (Centre for Research into Ecological and Environmental Modelling, School of Mathematics and Statistics, University of St Andrews, United Kingdom), Tyler Caslin (University of St Andrews, United Kingdom), Alexander Wisowaty (University of St Andrews, United Kingdom), Xu Zhu (University of St Andrews, United Kingdom), Mohamed Khamis (University of Glasgow, United Kingdom), Juan Ye (University of St Andrews, United Kingdom)

Eyettention: An Attention-based Dual-Sequence Model for Predicting Human Scanpaths during Reading

Shuwen Deng (University of Potsdam, Germany), David Reich (Universität Potsdam, Germany), Paul Prasse (University of Potsdam, Germany), Patrick Haller (University of Zurich, Switzerland), Tobias Scheffer (University of Potsdam, Germany), Lena Jäger (University of Zurich, Switzerland)

Towards Modeling Human Attention from Eye Movements for Neural Source Code Summarization

Aakash Bansal (University of Notre Dame, United States), Bonita Sharif (University of Nebraska - Lincoln, United States), Collin McMillan (University of Notre Dame, United States)

Unconscious frustration: dynamically assessing user experience using eye and mouse tracking

Scott Stone (University of Alberta, Canada), Craig Chapman (University of Alberta, Canada)

G-DAIC: A Gaze Initialized Framework for Description and Aesthetic-Based Image Cropping

Nora Horanyi (University of Birmingham, United Kingdom), Yuqi Hou (University of Birmingham, United Kingdom), Aleš Leonardis (University of Birmingham, United Kingdom), Hyung Jin Chang (University of Birmingham, United Kingdom)

Classification of Alzheimer's using deep-learning methods on webcam-based gaze data

Anuj Harisinghani (UBC, Canada), Cristina Conati (UBC, Canada), Giuseppe Carenini (UBC, Canada), Thalia Field (UBC, Canada), Hyeju Jang (UBC, Canada), Gabriel Murray (University of the Fraser Valley, Canada)

Exploring Dwell-time from Human Cognitive Processes for Dwell Selection

Toshiya Isomoto (University of Tsukuba, Japan), Shota Yamanaka (Yahoo Japan Corporation, Japan), Buntarou Shizuki (University of Tsukuba, Japan)

Exploring the Effects of Scanpath Feature Engineering for Supervised Image Classification Models

Sean Anthony Byrne (IMT Lucca, Italy), Virmarie Maquiling (Eberhard Karls University of Tuebingen, Germany), Adam Peter Frederick Reynolds (IMT School for Advanced Studies, Italy), Luca Polonio (University of Milano - Bicocca, Italy), Nora Castner (University of Tübingen, Germany), Enkelejda Kasneci (Technical University of Munich, Germany)

Exploring Gaze-assisted and Hand-based Region Selection in Augmented Reality

Rongkai Shi (Xi'an Jiaotong-Liverpool University, China), Yushi Wei (Xi'an Jiaotong-Liverpool University, China), Xueying Qin (Shandong University, China), Pan Hui (The Hong Kong University of Science and Technology, China, University of Helsinki, Finland), Hai-Ning Liang (Xi'an Jiaotong-Liverpool University, China)

Investigating Privacy Perceptions and Subjective Acceptance of Eye Tracking on Handheld Mobile Devices

Noora Alsakar (Imam Abdulrahman Bin Faisal University, Saudi Arabia, University of Glasgow, United Kingdom), Yasmeen Abdrabou (University of the Bundeswehr Munich, Germany), Simone Stumpf (University of Glasgow, United Kingdom), Mohamed Khamis (University of Glasgow, United Kingdom)

Studying Developer Eye Movements to Measure Cognitive Workload and Visual Effort for Expertise Assessment

Salwa Aljehane (University of Tabuk, Saudi Arabia), Bonita Sharif (University of Nebraska - Lincoln, United States), Jonathan Maletic (Kent State University, United States)

Practical Perception-Based Evaluation of Gaze Prediction for Gaze Contingent Rendering

Samantha Aziz (Texas State University, United States), Dillon Lohr (Texas State University, United States), Razvan Stefanescu (Meta, United States), Oleg Komogortsev (Texas State University, United States)

Short Papers

GEAR: Gaze-enabled augmented reality for human activity recognition

Kenan Bektas (University of St. Gallen, Institute of Computer Science, Switzerland), Jannis Strecker (University of St. Gallen, Institute of Computer Science, Switzerland), Simon Mayer (University of St. Gallen, Institute of Computer Science, Switzerland), Dr. Kimberly Garcia (University of St. Gallen, Institute of Computer Science, Switzerland), Jonas Hermann (University of St. Gallen, Institute of Computer Science, Switzerland), Kay Erik Jenss (University of St. Gallen, Institute of Computer Science, Switzerland), Yasmine Antille (University of St. Gallen, Institute of Computer Science, Switzerland), Marc Soler (University of St. Gallen, Institute of Computer Science, Switzerland)

Comparing Visual Search Patterns in Chest X-Ray Diagnostics

Catarina Moreira (Queensland University of Technology, Australia, Instituto Superior Técnico / INESC-ID, Portugal), Diogo Miguel Alvito (Instituto Superior Técnico, Portugal), Sandra Costa Sousa (Grupo Lusíadas, Portugal), Isabel Maria Gomes Blanco Nobre (Grupo Lusíadas, Portugal), Chun Ouyang (Queensland University of Technology, Australia), Regis Kopper (University of North Carolina at Greensboro, United States), Andrew Duchowski (Clemson University, United States), Joaquim Jorge (Universidade de Lisboa, Portugal)

Getting the Most from Eye-Tracking: An Evaluation of Improving User-Interaction Based Reading Estimation Models

Ruoyan Kong (University of Minnesota, United States), Ruixuan Sun (University of Minnesota, United States), Charles Chuankai Zhang (University of Minnesota, United States), Chen Chen (University of Minnesota, United States), Sneha Patri (University of Minnesota, United States), Gayathri Gajjela (University of Minnesota, United States), Joseph Konstan (University of Minnesota, United States)

The Salient360! Toolbox: processing, visualising, and comparing gaze data in 3D

Erwan David (Goethe-Universität, Germany), Jesús Gutiérrez (Universidad Politécnica de Madrid, Spain), Melissa Le-Hoa Vo (Goethe University, Germany), Antoine Coutrot (University of Lyon, France), Matthieu Perreira da Silva (University of Nantes, France), Patrick Le Callet (LS2N/université de Nantes, France)

Multi-Rate Sensor Fusion for Unconstrained Near-Eye Gaze Estimation

Cristina Palmero (Universitat de Barcelona, Spain, Computer Vision Center, Spain), Oleg Komogortsev (Texas State University, United States), Sergio Escalera (Universitat de Barcelona, Spain), Sachin Talathi (26 Value Partners LP, United States)

Gaze Pattern Recognition in Dyadic Communication

Fei Chang (Institute of Computing Technology, Chinese Academy of Science, China), Jiabei Zeng (Institute of Computing Technology, Chinese Academy of Science, China), Qiaoyun Liu (East China Normal University, China), Shiguang Shan (Institute of Computing Technology, Chinese Academy of Science, China)

Bridging the Gap: Eye Movement Events as Interpretable Concepts to Explain the Output of Deep Neural Sequence Models

Daniel Krakowczyk (University of Potsdam, Germany), Paul Prasse (University of Potsdam, Germany), David Reich (Universität Potsdam, Germany), Sebastian Lapuschkin (Fraunhofer Heinrich Hertz Institute, Germany), Tobias Scheffer (University of Potsdam, Germany), Lena Jäger (University of Zurich, Switzerland)

On The Visibility Of Fiducial Markers For Mobile Eye Tracking

Naila Ayala (University of Waterloo, Canada), Diako Mardanbegi (Adhawk Microsystems, Canada), Andrew Duchowski (Clemson University, United States), Ewa Niechwiej-Szwedo (University of Waterloo, Canada), Shi Cao (University of Waterloo, Canada), Suzanne Kearns (University of Waterloo, Canada), Elizabeth Irving (University of Waterloo, Canada)

Pupil Diameter during Counting Tasks as Potential Baseline for Virtual Reality Experiments

Philipp Stark (University of Tübingen, Germany), Tobias Appel (University of Tübingen, Germany), Milo Olbrich (Stuttgart Media University, Germany), Enkelejda Kasneci (Technical University of Munich, Germany)

TF-IDF based Scene-Object Relations Correlate With Visual Attention

Pelin Celikkol (University of Potsdam, Germany), Jochen Laubrock (University of Potsdam, Germany), David Schlangen (University of Potsdam, Germany)

Introducing Explicit Gaze Constraints to Face Swapping

Ethan Wilson (University of Florida, United States), Frederick Shic (Seattle Children's Research Institute, United States), Eakta Jain (University of Florida, United States)

GE-Simulator: An Open-Source Tool for Simulating Real-Time Errors for HMD-based Eye Trackers

Ludwig Sidenmark (Lancaster University, United Kingdom), Mathias Lystbæk (Aarhus University, Denmark), Hans Gellersen (Lancaster University, United Kingdom)

Eye tracking to evaluate the effectiveness of electronic medical record training

Nadine Moacdieh (Carleton University, Canada, American University of Beirut, Lebanon), Michel Dibo (American University of Beirut, Lebanon), Zeina Halabi (American University of Beirut, Lebanon), Jumana Antoun (American University of Beirut, Lebanon)

Prediction Procedure for Dementia Levels based on Waveform Features of Binocular Pupil Light Reflex

Minoru Nakayama (Tokyo Institute of Technology, Japan), Wioletta Nowak (Wroclaw University of Science and Technology, Poland), Anna Zarowska (Wroclaw University of Science and Technology, Poland)

Synthetic predictabilities from large language models explain reading eye movements

Johan Chandra (Brandenburg Medical School, Germany), Nicholas Witzig (Brandenburg Medical School, Germany), Jochen Laubrock (University of Potsdam, Germany)

Visual Center Biasing in a Stimulus-Free Laboratory Setting

Jan Ehlers (Bauhaus-Universität Weimar, Germany), Janine Grimmer (Universität Ulm, Germany)

Area of interest adaption using feature importance

Wolfgang Fuhl (Wilhelm Schickard Institut, Germany), Susanne Zabel (University Tübingen, Germany), Theresa Harbig (University of Tuebingen, Germany), Julia-Astrid Moldt (University Hospital Tübingen, Germany), Teresa Festl Wietek (University Hospital Tübingen, Germany), Anne Herrmann-Werner (University Hospital Tübingen, Germany), Kay Nieselt (Institute for Bioinformatics and Medical Informatics, Germany)

Visual Perception and Performance: An Eye Tracking Study

Sonali Aatrai (Indian Institute of Technology Kharagpur, India), Sparsh Jha (Indian Institute of Technology Kharagpur, India), Rajlakshmi Guha (Indian Institute of Technology Kharagpur, India)

Gaze-based Mode-Switching to Enhance Interaction with Menus on Tablets

Yanfei Hu Fleischhauer (LMU Munich, Germany), Hemant Surale (University of Waterloo, Canada), Florian Alt (University of the Bundeswehr Munich, Germany), Ken Pfeuffer (Aarhus University, Denmark, Bundeswehr University Munich, Germany)

One step closer to EEG based eye tracking

SP-EyeGAN: Generating Synthetic Eye Movement Data with Generative Adversarial Networks

Paul Prasse (University of Potsdam, Germany), David Reich (Universität Potsdam, Germany), Silvia Makowski (University of Potsdam, Germany), Seoyoung Ahn (Stony Brook University, United States), Tobias Scheffer (University of Potsdam, Germany), Lena Jäger (University of Zurich, Switzerland)

Predicting the Allocation of Attention: Using contextual guidance of eye movements to examine the distribution of attention

Karolina Krzyś (Queen's University, Canada), Mubeena Mistry (University of Toronto, Canada), Tyler Yan (University of Toronto, Canada), Monica Castelhano (Queen's University, Canada)

A Deep Learning Architecture for Egocentric Time-to-Saccade Prediction using Weibull Mixture-Models and Historic Priors

Tim Rolff (Universität Hamburg, Germany), Susanne Schmidt (Universität Hamburg, Germany), Frank Steinicke (Universität Hamburg, Germany), Simone Frintrop (Universität Hamburg, Germany)