TECO#

class pymovements.datasets.TECO(name: str = 'TECO', long_name: str = 'Tsukuba Eye-tracking Corpus', mirrors: dict[str, Sequence[str]] = <factory>, resources: ResourceDefinitions = <factory>, experiment: Experiment | None = <factory>, extract: dict[str, bool] | None = None, custom_read_kwargs: dict[str, dict[str, Any]] = <factory>, column_map: dict[str, str] = <factory>, trial_columns: list[str] = <factory>, time_column: str | None = None, time_unit: str | None = None, pixel_columns: list[str] | None = None, position_columns: list[str] | None = None, velocity_columns: list[str] | None = None, acceleration_columns: list[str] | None = None, distance_column: str | None = None, filename_format: dict[str, str] | None = None, filename_format_schema_overrides: dict[str, dict[str, type]] | None = None)[source]#

TECO dataset [Nahatame et al., 2024].

The Tsukuba Eye-tracking Corpus (TECO) provides eye-tracking data from 41 Japanese learners of English, who read 30 English passages with a total of over 410,000 tokens.

This dataset includes detailed eye-movement measures such as skipping, first fixation duration, and regression, offering insights into the cognitive processes underlying second-language reading. TECO also examines the impact of lexical and reader factors, like word length and reading proficiency, on eye-tracking behavior.

The dataset aims to support research on L2 reading comprehension.

name#

The name of the dataset.

Type: str

long_name#

The full name of the dataset.

Type: str

resources#

A list of dataset gaze_resources. Each list entry must be a dictionary with the following keys:

- resource: The URL suffix of the resource. This will be concatenated with the mirror.
- filename: The filename under which the file is saved.
- md5: The MD5 checksum of the respective file.

Type: ResourceDefinitions
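As a rough illustration of how the three keys fit together, the sketch below builds a download URL from a mirror and a resource entry and checks a payload against its MD5 checksum. It uses only the standard library, and the mirror URL, file names, and helper functions are hypothetical placeholders, not TECO's actual resources or the pymovements implementation.

```python
import hashlib

# Hypothetical mirror and resource entry (placeholders, not TECO's real values).
mirror = "https://example.org/"
resource = {
    "resource": "files/example.zip",  # URL suffix, concatenated with the mirror
    "filename": "example.zip",        # local name under which the file is saved
    "md5": hashlib.md5(b"example payload").hexdigest(),
}


def build_url(mirror: str, resource: dict) -> str:
    """Concatenate the mirror with the resource's URL suffix."""
    return mirror + resource["resource"]


def md5_matches(data: bytes, expected_md5: str) -> bool:
    """Return True if the MD5 digest of data equals the expected checksum."""
    return hashlib.md5(data).hexdigest() == expected_md5
```

With these helpers, `build_url(mirror, resource)` gives the full address to fetch, and `md5_matches` can be applied to the downloaded bytes before they are saved.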

filename_format#

Regular expression which will be matched before trying to load the file. Named groups will appear in the fileinfo dataframe.

Type: dict[str, str] | None
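The idea of matching filenames and collecting named groups can be sketched with the standard `re` module. The pattern, the group names `subject_id` and `session`, and the helper `collect_fileinfo` are hypothetical and chosen for illustration only; they are not TECO's actual filename format. Non-matching files are assumed here to be skipped.

```python
import re

# Hypothetical pattern for illustration; not TECO's actual filename_format.
PATTERN = re.compile(r"(?P<subject_id>\d+)_(?P<session>[a-z]+)\.csv")


def collect_fileinfo(filenames, pattern=PATTERN):
    """Return one dict of named groups per filename that matches the pattern."""
    fileinfo = []
    for filename in filenames:
        match = pattern.fullmatch(filename)
        if match is None:
            # Assumption: files that do not match are not loaded.
            continue
        fileinfo.append({"filename": filename, **match.groupdict()})
    return fileinfo
```

Each returned dictionary corresponds to one row of file information, with one entry per named group. All group values are strings at this stage.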

filename_format_schema_overrides#

If named groups are present in the filename_format, this makes it possible to cast specific named groups to a particular datatype.

Type: dict[str, dict[str, type]] | None
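A minimal sketch of such a cast is shown below, using plain Python rather than the library's own machinery. The outer key `"gaze"` is an assumption about how the nested dictionary is organized, and the group names and the helper `cast_groups` are hypothetical.

```python
# Hypothetical overrides; the outer key "gaze" and the group names are assumptions.
schema_overrides = {"gaze": {"subject_id": int}}


def cast_groups(groups: dict, schema: dict) -> dict:
    """Cast each named group to its override type, leaving the rest as strings."""
    return {name: schema.get(name, str)(value) for name, value in groups.items()}


# Named groups as they would come out of a regular-expression match.
groups = {"subject_id": "007", "session": "practice"}
casted = cast_groups(groups, schema_overrides["gaze"])
```

Groups without an override stay strings, so only the columns that need a numeric or other type have to be listed.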

trial_columns#

The names of the trial columns in the input data frame. If the list is empty or None, the input data frame is assumed to contain only one trial. If the list is not empty, the input data frame is assumed to contain multiple trials and the transformation methods will be applied to each trial separately.

Type: list[str]
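The per-trial behavior described above can be sketched in plain Python: rows are grouped by the values of the trial columns, and a transformation is applied to each group on its own. This is an illustration only, not how pymovements implements it; the helpers `apply_per_trial` and `relative_time` and the column names `trial` and `time` are hypothetical.

```python
from collections import defaultdict


def apply_per_trial(rows, trial_columns, transform):
    """Group rows by the trial columns and apply transform to each group."""
    columns = trial_columns or []  # empty or None: everything is one trial
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in columns)
        groups[key].append(row)
    result = []
    for trial_rows in groups.values():
        result.extend(transform(trial_rows))
    return result


def relative_time(trial_rows):
    """Example transform: shift timestamps so each trial starts at zero."""
    start = trial_rows[0]["time"]
    return [{**row, "time": row["time"] - start} for row in trial_rows]


rows = [
    {"trial": 1, "time": 10},
    {"trial": 1, "time": 12},
    {"trial": 2, "time": 20},
    {"trial": 2, "time": 25},
]
per_trial = apply_per_trial(rows, ["trial"], relative_time)
single_trial = apply_per_trial(rows, None, relative_time)
```

With `["trial"]` the timestamps restart at zero in every trial, whereas with `None` all rows are treated as one trial and only the first row is shifted to zero.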

column_map#

The keys are the columns to read, the values are the names to which they should be renamed.

Type: dict[str, str]
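As a small plain-Python sketch of this mapping, the function below keeps only the columns named as keys and renames them to the corresponding values. The column names and the helper `apply_column_map` are hypothetical and do not reflect TECO's actual columns.

```python
def apply_column_map(row: dict, column_map: dict) -> dict:
    """Read only the mapped columns and rename them to their target names."""
    return {new_name: row[old_name] for old_name, new_name in column_map.items()}


# Hypothetical raw row and mapping, for illustration only.
raw_row = {"RECORDING_SESSION_LABEL": "s01", "IA_LABEL": "word", "UNUSED": 0}
column_map = {"RECORDING_SESSION_LABEL": "subject_id", "IA_LABEL": "token"}
renamed = apply_column_map(raw_row, column_map)
```

Columns that are not listed as keys, such as `UNUSED` here, are dropped from the result.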

custom_read_kwargs#

If specified, these keyword arguments will be passed to the file reading function.

Type: dict[str, dict[str, Any]]
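The pattern of forwarding user-supplied keyword arguments to a reader can be sketched with the standard `csv` module. This stands in for whatever reading function the library actually calls; the outer key `"gaze"`, the helper `read_table`, and the sample data are assumptions made for illustration.

```python
import csv
import io

# Hypothetical keyword arguments; the outer key "gaze" is an assumption.
custom_read_kwargs = {"gaze": {"delimiter": "\t"}}


def read_table(text: str, read_kwargs: dict) -> list:
    """Parse delimited text, forwarding read_kwargs to csv.DictReader."""
    reader = csv.DictReader(io.StringIO(text), **read_kwargs)
    return [dict(row) for row in reader]


table = read_table("subject\ttime\n1\t100\n", custom_read_kwargs["gaze"])
```

Keeping the arguments in the dataset definition means that details such as the delimiter travel with the definition instead of being repeated at every load call.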

Examples

Initialize your Dataset object with the TECO definition:

>>> import pymovements as pm
>>>
>>> dataset = pm.Dataset("TECO", path='data/TECO')

Download the dataset resources:

>>> dataset.download()

Load the data into memory:

>>> dataset.load()

Methods

`__init__`([name, long_name, mirrors, ...])
`from_yaml`(path)	Load a dataset definition from a YAML file.
`to_dict`(*[, exclude_private, exclude_none])	Return dictionary representation.
`to_yaml`(path, *[, exclude_private, exclude_none])	Save a dataset definition to a YAML file.

Attributes

`acceleration_columns`
`distance_column`
`extract`
`filename_format`
`filename_format_schema_overrides`
`has_resources`	Checks for resources in `resources`.
`long_name`
`name`
`pixel_columns`
`position_columns`
`time_column`
`time_unit`
`trial_columns`
`velocity_columns`
`resources`
`column_map`
`custom_read_kwargs`
`mirrors`
`experiment`
