Recently, researchers in the empirical software engineering community have become increasingly interested in using eye tracking to understand how developers read and write code. However, converting eye gaze data into semantic information (e.g., line, column, token) is technically challenging.
While existing tools like CodeGRITS or iTrace provide in-IDE solutions, many researchers still prefer to use web-based experimental setups for several reasons:
- Convenient study design. Participants may not need to modify code or rely on IDE features;
- Support for collecting other data types. Enables gathering additional data, such as summary writing or surveys;
- Feasibility of remote data collection. Allows easy link sharing with participants, particularly using webcam-based eye tracking.
However, existing web-based experimental setups (e.g., [1], [2], [3]) present several limitations:
- No syntax highlighting. This differs from participants' daily coding experience and may affect their reading behavior;
- Labor-intensive and inaccurate post-processing. For example, the bounding box of each token has to be detected from screenshots with OpenCV, and the text recognized with OCR;
- Limited code length. The entire code must be displayed on the screen, and participants cannot scroll;
- Difficulty in relating gaze to AST semantics. This requires reparsing the code and remapping the tokens to AST nodes post hoc;
- No support for code editing.
We developed a technical workflow for a web-based eye tracking code editor that addresses these limitations. The key idea is to use CodeMirror, a popular web-based code editor, to provide code highlighting and editing features. We convert eye gaze data into semantic information by leveraging CodeMirror's APIs and resolving numerous technical issues.
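The core of the coordinate-to-semantics step can be sketched as follows. This is a minimal illustration, not the code in this repository: it assumes a CodeMirror 6 `EditorView` and, optionally, a Lezer syntax tree (for example, the result of `syntaxTree(view.state)` from `@codemirror/language`). The helper name `gazeToSemantic` and the shape of the returned object are hypothetical. Gaze coordinates are assumed to have already been converted to viewport (client) coordinates.

```javascript
// Sketch only: `view` is assumed to be a CodeMirror 6 EditorView and `tree`
// an optional Lezer syntax tree. `gazeToSemantic` is a hypothetical helper
// name chosen for illustration.
function gazeToSemantic(view, x, y, tree = null) {
  // posAtCoords takes viewport coordinates and returns null when the
  // point falls outside the editor content.
  const offset = view.posAtCoords({ x, y });
  if (offset === null) return null;

  // A Line object exposes its 1-based `number` and the document offset
  // (`from`) at which the line starts.
  const line = view.state.doc.lineAt(offset);
  const result = {
    offset,
    line: line.number,
    column: offset - line.from + 1, // 1-based column within the line
  };

  if (tree) {
    // Innermost syntax node at the offset (side = 1 prefers a node that
    // starts exactly at this position).
    const node = tree.resolveInner(offset, 1);
    result.node = { type: node.name, from: node.from, to: node.to };
    result.token = view.state.doc.sliceString(node.from, node.to);
  }
  return result;
}
```

Because the syntax node comes from the same parse that drives the highlighting, a gaze sample can be mapped to an AST node at collection time rather than by reparsing the code afterwards.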
Below is a snapshot of our tool in action (using the mouse as a proxy for eye gaze). Feel free to try our live demo as well!
The main technical details can be found in /src/components/CodeMirrorEditor.js. Please refer to it if you want to adapt this tool for your research. We also provide an example of publishing gaze/mouse data streams from a Python server in /public/mouse_simulation.py (often needed in practice, as the Tobii Pro SDK doesn't provide JavaScript APIs).
We previously tried the Monaco Editor, another popular web-based code editor that has the same core features as VS Code. However, Monaco Editor doesn't offer any APIs to convert coordinates to the offset or line/column position in the code, which is essential for analyzing eye tracking data.
For more information, please contact Ningzhi Tang from the SaNDwich Lab at the University of Notre Dame. I'm happy to discuss the technical details and potential collaborations!
If you prefer an in-IDE solution, check out CodeGRITS, developed by our team for JetBrains IDEs, which also supports tracking developer interactions within the IDE.