UCL
- class pymovements.datasets.UCL(name: str = 'UCL', long_name: str = 'University College London corpus', mirrors: dict[str, Sequence[str]] = <factory>, resources: ResourceDefinitions = <factory>, experiment: Experiment | None = <factory>, extract: dict[str, bool] | None = None, custom_read_kwargs: dict[str, Any] = <factory>, column_map: dict[str, str] = <factory>, trial_columns: list[str] | None = None, time_column: str | None = None, time_unit: str | None = None, pixel_columns: list[str] | None = None, position_columns: list[str] | None = None, velocity_columns: list[str] | None = None, acceleration_columns: list[str] | None = None, distance_column: str | None = None, filename_format: dict[str, str] | None = None, filename_format_schema_overrides: dict[str, dict[str, type]] | None = None)
UCL dataset [Frank et al., 2013].
UCL is a dataset of word-by-word reading times collected through self-paced reading and eye-tracking experiments to evaluate computational psycholinguistic models of English sentence comprehension. The authors selected 361 sentences from narrative sources, ensuring they were understandable without context, and recorded reading times from participants using both methods.
For more details check out the original paper [Frank et al., 2013].
- name
The name of the dataset.
- Type:
str
- long_name
The full name of the dataset.
- Type:
str
- resources
A list of dataset gaze_resources. Each list entry must be a dictionary with the following keys:
  - resource: The URL suffix of the resource. This will be concatenated with the mirror.
  - filename: The filename under which the file is saved.
  - md5: The MD5 checksum of the respective file.
- Type:
ResourceDefinitions
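As a rough illustration of how a resource entry with these three keys could be used, the sketch below joins a mirror with the resource suffix and checks a payload against the stored MD5. The mirror URL, filenames, payload, and the helper functions `build_url` and `checksum_matches` are all hypothetical; this is not pymovements' internal download code.

```python
import hashlib

# Hypothetical mirror and resource entry, for illustration only.
# These are NOT the real UCL mirror, filename, or checksum.
mirror = "https://example.org/data/"
payload = b"example payload"
resource_entry = {
    "resource": "corpus.zip",                  # URL suffix appended to the mirror
    "filename": "corpus.zip",                  # local name the file is saved under
    "md5": hashlib.md5(payload).hexdigest(),   # expected MD5 checksum
}


def build_url(mirror: str, entry: dict[str, str]) -> str:
    """Concatenate the mirror with the resource suffix."""
    return mirror + entry["resource"]


def checksum_matches(data: bytes, entry: dict[str, str]) -> bool:
    """Compare the MD5 of the downloaded bytes with the expected value."""
    return hashlib.md5(data).hexdigest() == entry["md5"]


url = build_url(mirror, resource_entry)
ok = checksum_matches(payload, resource_entry)
```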
- filename_format
Regular expression that is matched against each filename before the file is loaded. Named groups will appear in the fileinfo dataframe.
- Type:
dict[str, str] | None
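To make the role of named groups concrete, here is a minimal sketch using only Python's `re` module. The dictionary key `"gaze"`, the pattern, and the filename are assumptions made up for this example; the actual filename_format of the UCL definition is not shown on this page.

```python
import re

# Hypothetical filename_format; the key and the pattern are illustrative only.
filename_format = {
    "gaze": r"subject_(?P<subject_id>\d+)_session_(?P<session>\d+)\.csv",
}
filename = "subject_07_session_2.csv"

# The whole filename must match before the file would be considered for loading.
match = re.fullmatch(filename_format["gaze"], filename)

# Each named group becomes one piece of per-file metadata (a "fileinfo" entry).
fileinfo = match.groupdict() if match is not None else None
```

A filename that does not match the pattern yields no metadata and would be skipped in this sketch.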
- filename_format_schema_overrides
If named groups are present in the filename_format, this makes it possible to cast specific named groups to a particular datatype.
- Type:
dict[str, dict[str, type]] | None
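The casting described above can be sketched in plain Python as follows. The pattern, the key `"gaze"`, and the group names are hypothetical, and the conversion loop only stands in for whatever pymovements does internally.

```python
import re

# Hypothetical pattern and overrides, for illustration only.
filename_format = {
    "gaze": r"subject_(?P<subject_id>\d+)_(?P<condition>[a-z]+)\.csv",
}
filename_format_schema_overrides = {
    "gaze": {"subject_id": int},  # cast only this named group
}

match = re.fullmatch(filename_format["gaze"], "subject_07_reading.csv")
if match is None:
    raise ValueError("filename does not match the expected format")

overrides = filename_format_schema_overrides["gaze"]
# Groups without an override stay as strings, which is what a regex returns.
typed_fileinfo = {
    group: overrides.get(group, str)(value)
    for group, value in match.groupdict().items()
}
```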
- column_map
The keys are the columns to read, the values are the names to which they should be renamed.
- Type:
dict[str, str]
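A small sketch of the select-and-rename behaviour, using the standard `csv` module as a stand-in reader. The column names and data are invented for the example and are not the actual UCL columns.

```python
import csv
import io

# Hypothetical mapping: keys are source columns to read, values are new names.
column_map = {"word_pos": "word_index", "RT": "reading_time"}

# Invented data; "subj" is not in column_map and is therefore not read.
raw = "subj,word_pos,RT\n1,0,312\n1,1,287\n"

rows = [
    {column_map[name]: value for name, value in row.items() if name in column_map}
    for row in csv.DictReader(io.StringIO(raw))
]
```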
- custom_read_kwargs
If specified, these keyword arguments will be passed to the file reading function.
- Type:
dict[str, Any]
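The forwarding pattern can be sketched as below. The function `read_table` is a hypothetical stand-in built on the standard `csv` module, not the reader pymovements actually uses, and the keyword argument shown is only an example.

```python
import csv
import io

# Hypothetical keyword arguments; valid names depend on the real reading function.
custom_read_kwargs = {"delimiter": "\t"}

raw = "word\treading_time\nthe\t312\n"


def read_table(text: str, **read_kwargs) -> list[dict[str, str]]:
    """Stand-in reader that passes extra keyword arguments straight through."""
    return list(csv.DictReader(io.StringIO(text), **read_kwargs))


rows = read_table(raw, **custom_read_kwargs)
```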
Examples
Initialize your Dataset object with the UCL definition:
>>> import pymovements as pm
>>>
>>> dataset = pm.Dataset("UCL", path='data/UCL')
Download the dataset resources:
>>> dataset.download()
Load the data into memory:
>>> dataset.load()
Methods
__init__([name, long_name, mirrors, ...])
from_yaml(path)
Load a dataset definition from a YAML file.
to_dict(*[, exclude_private, exclude_none])
Return dictionary representation.
to_yaml(path, *[, exclude_private, exclude_none])
Save a dataset definition to a YAML file.
Attributes
acceleration_columns
distance_column
extract
has_resources
Checks for resources in resources.
pixel_columns
position_columns
time_column
time_unit
trial_columns
velocity_columns
mirrors
experiment