{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Handling Gaze Events\n", "\n", "## What you will learn in this tutorial:\n", "\n", "* how to detect different events using different algorithms like IDT, IVT and microsaccades\n", "* how to compute event properties like peak velocity and amplitude\n", "* how to save and load your event data\n" ] }, { "cell_type": "markdown", "id": "1", "metadata": {}, "source": [ "## Preparations\n", "At first we import `pymovements` as the alias `pm` for convenience." ] }, { "cell_type": "code", "execution_count": null, "id": "2", "metadata": {}, "outputs": [], "source": [ "import pymovements as pm" ] }, { "cell_type": "markdown", "id": "3", "metadata": {}, "source": [ "Then we download a dataset `ToyDataset` and load its data:" ] }, { "cell_type": "code", "execution_count": null, "id": "4", "metadata": {}, "outputs": [], "source": [ "dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')\n", "dataset.download()\n", "dataset.load()" ] }, { "cell_type": "markdown", "id": "5", "metadata": {}, "source": [ "The dataset consist of gaze data in 20 files (check `Dataset/gaze` above). Every `Gaze` has some samples with six columns (check `Gaze/samples`): [time, stimuli_x, stimuli_y, text_id, page_id, pixel]. The `Gaze/events` DataFrame is empty so far. 
To be able to calculate events, we need to do some basic preprocessing, which will add new columns to the dataset's samples DataFrames:\n", "\n", "* `pix2deg()`: adds a `position` column with coordinates in degrees of visual angle from the screen center, needed by the `idt` algorithm\n", "* `pos2vel()`: adds a `velocity` column with gaze velocities, needed by the `microsaccades` and `ivt` algorithms\n" ] }, { "cell_type": "code", "execution_count": null, "id": "6", "metadata": {}, "outputs": [], "source": [ "dataset.pix2deg()\n", "dataset.pos2vel('smooth')\n", "dataset.gaze[0]" ] }, { "cell_type": "markdown", "id": "7", "metadata": {}, "source": [ "Now every `Gaze/samples` DataFrame has two more columns, `position` and `velocity`, which will be used by the event detection algorithms." ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Detecting Events\n", "\n", "*pymovements* provides a range of event detection methods for several types of gaze events.\n", "\n", "See the reference for [pymovements.events](https://pymovements.readthedocs.io/en/latest/reference/pymovements.events.html) to get an overview of all the supported methods.\n", "\n", "For this tutorial we will use the I-DT and I-VT (`idt` and `ivt`) algorithms for detecting fixations and the `microsaccades` algorithm for detecting saccades.\n", "\n", "Let's start with fixation detection using the `idt` algorithm with a `dispersion_threshold` of 2.7:" ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "dataset.detect_events('idt', dispersion_threshold=2.7)" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "The detected events are added as rows with the name `fixation` to the event DataFrame:" ] }, { "cell_type": "code", "execution_count": null, "id": "11", "metadata": {}, "outputs": [], "source": [ "dataset.events[0]" ] }, { "cell_type": "markdown", "id": "12", "metadata": {}, "source": [ "As you can see, 56 fixations were found for the first 
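file." ] }, { "cell_type": "markdown", "id": "12a", "metadata": {}, "source": [ "To build some intuition for what `idt` just did, here is a minimal, self-contained sketch of the I-DT dispersion criterion (not the library code): a window of gaze positions is a fixation candidate while its dispersion, `(max_x - min_x) + (max_y - min_y)`, stays below the threshold." ] }, { "cell_type": "code", "execution_count": null, "id": "12b", "metadata": {}, "outputs": [], "source": [ "def dispersion(window):\n", "    # window is a list of (x, y) gaze positions in degrees\n", "    xs = [x for x, y in window]\n", "    ys = [y for x, y in window]\n", "    return (max(xs) - min(xs)) + (max(ys) - min(ys))\n", "\n", "# made-up tight cluster of positions: dispersion well below 2.7\n", "dispersion([(1.0, 2.0), (1.1, 2.1), (0.9, 1.9)]) < 2.7  # True" ] }, { "cell_type": "markdown", "id": "12c", "metadata": {}, "source": [ "This dispersion criterion is what `idt` applied, window by window, across the whole first 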
file." ] }, { "cell_type": "markdown", "id": "13", "metadata": {}, "source": [ "Now let's try another algorithm `ivt` with velocity_threshold=20. Because we don't want to mix fixations found by different algorithms we add `name` parameter with 'fixation.ivt'" ] }, { "cell_type": "code", "execution_count": null, "id": "14", "metadata": {}, "outputs": [], "source": [ "dataset.detect_events('ivt', velocity_threshold=20, name='fixation.ivt')\n", "dataset.events[0]" ] }, { "cell_type": "markdown", "id": "15", "metadata": {}, "source": [ "Now we have additional rows with name='fixations.ivt'." ] }, { "cell_type": "markdown", "id": "16", "metadata": {}, "source": [ "Let's try to use the `microsaccades` algorithm to detect fixations." ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": {}, "outputs": [], "source": [ "dataset.detect_events('microsaccades', minimum_duration=12)" ] }, { "cell_type": "markdown", "id": "18", "metadata": {}, "source": [ "The detected events are added as rows with the name `saccade` to the event dataframe:" ] }, { "cell_type": "code", "execution_count": null, "id": "19", "metadata": {}, "outputs": [], "source": [ "dataset.events[0]" ] }, { "cell_type": "markdown", "id": "20", "metadata": {}, "source": [ "Now there are three sets of events in the `dataset.events` DataFrame with different values in the 'name' column:" ] }, { "cell_type": "code", "execution_count": null, "id": "21", "metadata": {}, "outputs": [], "source": [ "set(dataset.events[0].frame['name'])" ] }, { "cell_type": "markdown", "id": "22", "metadata": {}, "source": [ "## Computing Event Properties" ] }, { "cell_type": "markdown", "id": "23", "metadata": {}, "source": [ "*pymovements* provides a range of event properties.\n", "\n", "See the reference for [pymovements.events](https://pymovements.readthedocs.io/en/latest/reference/pymovements.events.html) to get an overview of all the supported properties.\n", "\n", "For this tutorial we will compute several 
properties of saccades.\n", "\n", "We start out with the peak velocity:" ] }, { "cell_type": "code", "execution_count": null, "id": "24", "metadata": {}, "outputs": [], "source": [ "dataset.compute_event_properties(\"peak_velocity\")\n", "\n", "dataset.events[0]" ] }, { "cell_type": "markdown", "id": "25", "metadata": {}, "source": [ "Check above that a new column named `peak_velocity` has appeared in the event DataFrame.\n", "\n", "We can also pass a list of properties. Let's add the amplitude and dispersion:" ] }, { "cell_type": "code", "execution_count": null, "id": "26", "metadata": {}, "outputs": [], "source": [ "dataset.compute_event_properties([\"amplitude\", \"dispersion\"])\n", "\n", "dataset.events[0]" ] }, { "cell_type": "markdown", "id": "27", "metadata": {}, "source": [ "This way we can compute all of our desired properties in a single run." ] }, { "cell_type": "markdown", "id": "28", "metadata": {}, "source": [ "## Saving Event Data" ] }, { "cell_type": "markdown", "id": "29", "metadata": {}, "source": [ "Saving your event data is as simple as:" ] }, { "cell_type": "code", "execution_count": null, "id": "30", "metadata": {}, "outputs": [], "source": [ "dataset.save_events()" ] }, { "cell_type": "markdown", "id": "31", "metadata": {}, "source": [ "All of the event data is saved into this directory:" ] }, { "cell_type": "code", "execution_count": null, "id": "32", "metadata": {}, "outputs": [], "source": [ "dataset.paths.events" ] }, { "cell_type": "markdown", "id": "33", "metadata": {}, "source": [ "Let's confirm it by printing all files in this directory:" ] }, { "cell_type": "code", "execution_count": null, "id": "34", "metadata": {}, "outputs": [], "source": [ "print(list(dataset.paths.events.glob('*/*/*')))" ] }, { "cell_type": "markdown", "id": "35", "metadata": {}, "source": [ "All files have been saved into the `Dataset.paths.events` directory as files in [Feather format](https://arrow.apache.org/docs/python/feather.html). 
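" ] }, { "cell_type": "markdown", "id": "35a", "metadata": {}, "source": [ "The nested glob pattern `*/*/*` above reflects the folder hierarchy into which the event files were written. The exact layout depends on the dataset definition; the following self-contained sketch uses a made-up subject/session layout just to illustrate why a nested pattern is needed." ] }, { "cell_type": "code", "execution_count": null, "id": "35b", "metadata": {}, "outputs": [], "source": [ "import tempfile\n", "from pathlib import Path\n", "\n", "# build a made-up nested layout and list it with the same glob pattern\n", "root = Path(tempfile.mkdtemp())\n", "for subject in ('subject_1', 'subject_2'):\n", "    session = root / subject / 'session_1'\n", "    session.mkdir(parents=True)\n", "    (session / 'events.feather').touch()\n", "\n", "sorted(p.relative_to(root).as_posix() for p in root.glob('*/*/*'))" ] }, { "cell_type": "markdown", "id": "35c", "metadata": {}, "source": [ "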
\n", "\n", "If we want to save the data into an alternative directory and also use a different file format like `csv` we can use the following:" ] }, { "cell_type": "code", "execution_count": null, "id": "36", "metadata": {}, "outputs": [], "source": [ "dataset.save_events(events_dirname='events_csv', extension='csv')" ] }, { "cell_type": "markdown", "id": "37", "metadata": {}, "source": [ "Let's confirm again by printing all the new files in this alternative directory:" ] }, { "cell_type": "code", "execution_count": null, "id": "38", "metadata": {}, "outputs": [], "source": [ "alternative_dirpath = dataset.path / 'events_csv'\n", "print(list(alternative_dirpath.glob('*/*/*')))" ] }, { "cell_type": "markdown", "id": "39", "metadata": {}, "source": [ "### Loading Previously Computed Events Data\n", "\n", "Let's initialize a new dataset object from the same `ToyDataset`." ] }, { "cell_type": "code", "execution_count": null, "id": "40", "metadata": {}, "outputs": [], "source": [ "preprocessed_dataset = pm.Dataset('ToyDataset', path='data/ToyDataset')" ] }, { "cell_type": "markdown", "id": "41", "metadata": {}, "source": [ "When we load the dataset using `load()` without any parameters there will be no events loaded:" ] }, { "cell_type": "code", "execution_count": null, "id": "42", "metadata": {}, "outputs": [], "source": [ "preprocessed_dataset.load()" ] }, { "cell_type": "markdown", "id": "43", "metadata": {}, "source": [ "But when we load it with the `events=True` parameter the events will be loaded:\n" ] }, { "cell_type": "code", "execution_count": null, "id": "44", "metadata": {}, "outputs": [], "source": [ "preprocessed_dataset.load(events=True)" ] }, { "cell_type": "markdown", "id": "45", "metadata": {}, "source": [ "By default, the `events` directory and the `feather` extension will be chosen.\n", "\n", "In case of alternative directory names or other file formats you can use the following:" ] }, { "cell_type": "code", "execution_count": null, "id": "46", 
"metadata": {}, "outputs": [], "source": [ "preprocessed_dataset.load(\n", " events=True,\n", " events_dirname='events_csv',\n", " extension='csv',\n", ")\n", "dataset.events[0]" ] }, { "cell_type": "markdown", "id": "47", "metadata": {}, "source": [ "## What you have learned in this tutorial:\n", "\n", "* detecting different events with different algorithms by using `Dataset.detect_events()`\n", "* computing event properties by using `Dataset.compute_event_properties()`\n", "* saving your preprocesed data using `Dataset.save_preprocessed()`\n", "* load your preprocesed data using `Dataset.load(events=True)`\n", "* using custom directory names by specifying `preprocessed_dirname`\n", "* using other file formats than the default `feather` format by specifying `extension`" ] } ], "metadata": {}, "nbformat": 4, "nbformat_minor": 5 }