ETDD70

class pymovements.datasets.ETDD70(name: str = 'ETDD70', long_name: str = 'Eye-Tracking Dyslexia Dataset', mirrors: dict[str, Sequence[str]] = <factory>, resources: ResourceDefinitions = <factory>, experiment: Experiment = <factory>, extract: dict[str, bool] | None = None, custom_read_kwargs: dict[str, dict[str, Any]] = <factory>, column_map: dict[str, str] = <factory>, trial_columns: list[str] | None = None, time_column: str = 'time', time_unit: str = 'ms', pixel_columns: list[str] = <factory>, position_columns: list[str] | None = None, velocity_columns: list[str] | None = None, acceleration_columns: list[str] | None = None, distance_column: str | None = None, filename_format: dict[str, str] | None = None, filename_format_schema_overrides: dict[str, dict[str, type]] | None = None)

Eye-Tracking Dyslexia Dataset (ETDD70) [Sedmidubsky et al., 2025].

This dataset includes binocular eye-tracking data from 70 Czech children aged 9-10. Eye movements are recorded with an eye tracker at a sampling frequency of 250 Hz, and precomputed events are reported.

Each participant is instructed to read three texts:
  • The Syllables task contains 90 syllables arranged in a 9 x 10 matrix.

  • The MeaningfulText task consists of a passage about a young boy who watches a squirrel from his window.

  • The PseudoText task comprises fictional, meaningless words.

See the paper for details [Sedmidubsky et al., 2025].

name

The name of the dataset.

Type:

str

long_name

The full name of the dataset.

Type:

str

resources

A list of dataset gaze_resources. Each list entry must be a dictionary with the following keys:
  • resource: The URL suffix of the resource. This will be concatenated with the mirror.
  • filename: The filename under which the file is saved.
  • md5: The MD5 checksum of the respective file.

Type:

ResourceDefinitions
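
As a rough illustration of the three keys described above, the sketch below builds one resource entry, joins its URL suffix with a mirror, and checks an MD5 digest. The mirror URL, the suffix, the filename, and the payload are hypothetical placeholders rather than the actual ETDD70 resources, and `md5_matches` is a helper defined here for illustration, not a pymovements function.

```python
import hashlib

# Hypothetical resource entry; the values are placeholders, not the
# real ETDD70 resource definition.
resource = {
    "resource": "files/data.zip",  # URL suffix, concatenated with the mirror
    "filename": "data.zip",        # name under which the file is saved
    "md5": hashlib.md5(b"example payload").hexdigest(),  # expected checksum
}

mirror = "https://example.org/"    # hypothetical mirror URL
url = mirror + resource["resource"]


def md5_matches(payload: bytes, expected: str) -> bool:
    """Return True if the MD5 digest of payload equals the expected hex digest."""
    return hashlib.md5(payload).hexdigest() == expected
```

A downloaded file whose digest does not match the md5 entry would be treated as corrupt or incomplete.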

experiment

The experiment definition.

Type:

Experiment

filename_format

Regular expression which will be matched before trying to load the file. Named groups will appear in the fileinfo dataframe.

Type:

dict[str, str] | None
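
To illustrate how named groups in a filename pattern turn into fileinfo fields, here is a minimal sketch using only the standard `re` module. The pattern and the filename are hypothetical and are not taken from the ETDD70 definition. The sketch shows the general idea, not how pymovements matches filenames internally.

```python
import re

# Hypothetical pattern and filename, chosen only for illustration.
pattern = re.compile(r"Subject_(?P<subject_id>\d+)_(?P<task>\w+)\.csv")

match = pattern.fullmatch("Subject_1003_Syllables.csv")

# Each named group becomes one field; all captured values are strings.
fileinfo = match.groupdict() if match else None

# A filename that does not match the pattern yields no fileinfo entry.
no_match = pattern.fullmatch("notes.txt")
```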

filename_format_schema_overrides

If named groups are present in the filename_format, this makes it possible to cast specific named groups to a particular datatype.

Type:

dict[str, dict[str, type]] | None
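
The effect of casting named groups can be sketched with the standard library alone. The pattern, the filename, and the `parse_fileinfo` helper are hypothetical. The outer dictionary level of the attribute is left out here, so `schema_overrides` maps group names directly to types.

```python
import re

# Hypothetical pattern; not taken from the ETDD70 definition.
pattern = re.compile(r"Subject_(?P<subject_id>\d+)_(?P<task>\w+)\.csv")

# Simplified overrides: cast the subject_id group to int, leave the rest as str.
schema_overrides = {"subject_id": int}


def parse_fileinfo(filename: str):
    """Return the named groups of filename, cast according to schema_overrides."""
    match = pattern.fullmatch(filename)
    if match is None:
        return None
    return {
        name: schema_overrides.get(name, str)(value)
        for name, value in match.groupdict().items()
    }


info = parse_fileinfo("Subject_1003_Syllables.csv")
```

Without the override, subject_id would remain the string captured by the regular expression.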

time_column

The name of the timestamp column in the input data frame. This column will be renamed to time.

Type:

str

time_unit

The unit of the timestamps in the timestamp column in the input data frame. Supported units are ‘s’ for seconds, ‘ms’ for milliseconds, and ‘step’ for steps. If the unit is ‘step’, the experiment definition must be specified. All timestamps will be converted to milliseconds.

Type:

str
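
The unit conversion can be sketched as follows. The `to_milliseconds` helper is hypothetical, and the ‘step’ formula (step index times 1000 divided by the sampling rate) is an assumption about how the experiment's sampling rate would be used, not a confirmed description of the library's implementation. The 250 Hz value matches the sampling frequency stated for this dataset.

```python
def to_milliseconds(values, unit, sampling_rate=None):
    """Convert timestamps given in 's', 'ms', or 'step' to milliseconds."""
    if unit == "ms":
        return list(values)
    if unit == "s":
        return [v * 1000 for v in values]
    if unit == "step":
        # Assumption: one step is one sample, so the sampling rate is needed.
        if sampling_rate is None:
            raise ValueError("unit 'step' requires a sampling rate")
        return [v * 1000 / sampling_rate for v in values]
    raise ValueError(f"unsupported time unit: {unit!r}")


steps_ms = to_milliseconds([0, 1, 2], "step", sampling_rate=250)
seconds_ms = to_milliseconds([0.5, 2], "s")
```

At 250 Hz, consecutive samples are 4 ms apart.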

pixel_columns

The names of the pixel position columns in the input data frame. These columns will be nested into the column pixel. If the list is empty or None, the nested pixel column will not be created.

Type:

list[str]
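
The nesting described above can be pictured with a plain-Python sketch that operates on a single row. The column names and the `nest_pixel_columns` helper are hypothetical. The library works on data frames, so this only mirrors the resulting structure, not the actual implementation.

```python
def nest_pixel_columns(row, pixel_columns):
    """Move the listed columns of row into one nested 'pixel' list."""
    if not pixel_columns:
        # Empty list or None: no nested pixel column is created.
        return dict(row)
    nested = {k: v for k, v in row.items() if k not in pixel_columns}
    nested["pixel"] = [row[c] for c in pixel_columns]
    return nested


# Hypothetical column names for one gaze sample.
row = {"time": 0, "x_left": 512.0, "y_left": 384.0}
nested_row = nest_pixel_columns(row, ["x_left", "y_left"])
unchanged_row = nest_pixel_columns(row, None)
```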

custom_read_kwargs

If specified, these keyword arguments will be passed to the file reading function.

Type:

dict[str, dict[str, Any]]

Examples

Initialize your Dataset object with the ETDD70 definition:

>>> import pymovements as pm
>>>
>>> dataset = pm.Dataset("ETDD70", path='data/ETDD70')

Download the dataset resources:

>>> dataset.download()

Load the data into memory:

>>> dataset.load()

Methods

__init__([name, long_name, mirrors, ...])

from_yaml(path)

Load a dataset definition from a YAML file.

to_dict(*[, exclude_private, exclude_none])

Return dictionary representation.

to_yaml(path, *[, exclude_private, exclude_none])

Save a dataset definition to a YAML file.

Attributes

acceleration_columns

distance_column

extract

filename_format

filename_format_schema_overrides

has_resources

Check whether the resources attribute contains any resources.

long_name

name

pixel_columns

position_columns

time_column

time_unit

trial_columns

velocity_columns

resources

experiment

custom_read_kwargs

mirrors

column_map