graphomotor.plot.feature_plots

Feature visualization functions for Graphomotor.

This module provides plotting functions for visualizing extracted features from spiral drawing data. The plotting functions expect CSV files with the first 5 columns reserved for metadata (source_file, participant_id, task, hand, start_time), and treat all subsequent columns as numerical features.

Available Features

The graphomotor toolkit extracts 25 features from spiral drawing data. For a complete list of all available features, see the features module documentation.

Custom Features

Users can add custom feature columns to their CSV files alongside the standard graphomotor features. Any additional columns after the first 5 metadata columns will be automatically detected and available for plotting.
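
As a minimal sketch of this column convention (not graphomotor's own loader), the snippet below reads a small in-memory CSV and splits the header by position. The task names, the feature names `duration` and `my_custom_score`, and all values are hypothetical placeholders.

```python
import csv
import io

# Number of leading metadata columns, as described above.
N_METADATA_COLUMNS = 5

# Hypothetical CSV content: `duration` stands in for a standard feature and
# `my_custom_score` for a user-added column; task names and values are made up.
csv_text = """source_file,participant_id,task,hand,start_time,duration,my_custom_score
a.csv,P001,spiral_trace1,Dom,2024-01-01T10:00:00,12.5,0.91
b.csv,P001,spiral_recall1,NonDom,2024-01-01T10:05:00,15.0,0.84
"""

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)
rows = list(reader)

# Split the header by position: the first five names are metadata,
# everything after them is treated as a feature column.
metadata_columns = header[:N_METADATA_COLUMNS]
feature_columns = header[N_METADATA_COLUMNS:]

# Feature values arrive as strings and must parse as numbers.
feature_values = [[float(v) for v in row[N_METADATA_COLUMNS:]] for row in rows]

print(metadata_columns)
print(feature_columns)
print(feature_values)
```

Because columns are identified by position, a custom column only needs to be appended after `start_time` to be picked up as a feature.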

Plot Types

Distribution plots: Kernel density estimation plots showing feature distributions grouped by task type and hand.
Trend plots: Line plots displaying feature progression across task sequences.
Box plots: Box-and-whisker plots comparing distributions across conditions.
Cluster heatmaps: Hierarchically clustered heatmaps of standardized features.
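
The standardization behind the cluster heatmap can be sketched in plain Python. The source passes `z_score=0` to `sns.clustermap` on a table whose rows are features and whose columns are conditions, so each feature is standardized across conditions. The sketch below assumes the sample standard deviation and uses hypothetical feature names and values; it does not perform the hierarchical clustering step.

```python
import statistics

# Hypothetical raw values per feature and condition (condition = task + "_" + hand).
raw = {
    "duration": {
        "trace1_Dom": [10.0, 12.0, 14.0],
        "trace1_NonDom": [13.0, 16.0, 19.0],
        "recall1_Dom": [18.0, 20.0, 22.0],
    },
    "mean_speed": {
        "trace1_Dom": [300.0, 320.0, 310.0],
        "trace1_NonDom": [200.0, 220.0, 210.0],
        "recall1_Dom": [250.0, 270.0, 260.0],
    },
}


def standardize_rows(raw_values):
    """Return z-scores of per-condition medians, computed per feature (row)."""
    standardized = {}
    for feature, by_condition in raw_values.items():
        # One median per condition, as in the heatmap's input table.
        medians = {cond: statistics.median(vals) for cond, vals in by_condition.items()}
        values = list(medians.values())
        mean = statistics.fmean(values)
        # Sample standard deviation (ddof=1) is an assumption of this sketch.
        std = statistics.stdev(values)
        standardized[feature] = {cond: (m - mean) / std for cond, m in medians.items()}
    return standardized


z_scores = standardize_rows(raw)
for feature, row in z_scores.items():
    print(feature, row)
```

After this step every row has mean 0 and unit spread, which is why features measured in different units can share one color scale.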

View Source

"""Feature visualization functions for Graphomotor.

This module provides plotting functions for visualizing extracted features from spiral
drawing data. The plotting functions expect CSV files with the first 5 columns reserved
for metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`), and treat
all subsequent columns as numerical features.

Available Features
------------------
The graphomotor toolkit extracts 25 features from spiral drawing data.
For a complete list of all available features, see the
[features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

Custom Features
---------------
Users can add custom feature columns to their CSV files alongside the standard
graphomotor features. Any additional columns after the first 5 metadata columns
will be automatically detected and available for plotting.

Plot Types
----------
- **Distribution plots**: Kernel density estimation plots showing feature distributions
  grouped by task type and hand.
- **Trend plots**: Line plots displaying feature progression across task sequences.
- **Box plots**: Box-and-whisker plots comparing distributions across conditions.
- **Cluster heatmaps**: Hierarchically clustered heatmaps of standardized features.
"""

import pathlib
import warnings

import matplotlib
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

from graphomotor.core import config
from graphomotor.utils import plotting

matplotlib.use("agg")  # prevent interactive matplotlib
logger = config.get_logger()


def plot_feature_distributions(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot kernel density estimates for each feature grouped by task type and hand.

    This function creates kernel density estimation plots showing feature distributions
    grouped by task type (trace/recall) and hand (Dom/NonDom). The input data should
    be a CSV file with the first 5 columns reserved for metadata (`source_file`,
    `participant_id`, `task`, `hand`, `start_time`), with all subsequent columns treated
    as numerical features.

    Both standard graphomotor features and custom feature columns added by users
    are supported. For a complete list of the 25 standard features available from
    the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot. If None, all features are
            plotted. Can include any of the 25 standard graphomotor features (see
            module docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.
    """
    logger.debug("Starting feature distributions plot generation")

    plot_data, features, _ = plotting.prepare_feature_plot_data(data, features)

    hands = plot_data["hand"].unique()
    task_types = plot_data["task_type"].unique()

    colors = {
        (hand, task_type): plt.get_cmap("tab20")(i)
        for i, (hand, task_type) in enumerate(
            [(h, t) for h in hands for t in task_types]
        )
    }

    fig, axes = plotting.init_feature_subplots(len(features))
    for i, feature in enumerate(features):
        ax = axes[i]

        for hand in hands:
            for task_type in task_types:
                subset = plot_data[
                    (plot_data["hand"] == hand) & (plot_data["task_type"] == task_type)
                ]
                sns.kdeplot(
                    data=subset,
                    x=feature,
                    fill=True,
                    cut=0,
                    alpha=0.6,
                    color=colors[(hand, task_type)],
                    label=f"{hand} - {task_type.capitalize()}",
                    ax=ax,
                )

        display_name = plotting.format_feature_name(feature)
        ax.set_title(display_name)
        ax.set_xlabel(display_name)
        ax.set_ylabel("Density")
        ax.legend(title="Hand - Task Type")
        ax.grid(alpha=0.3)

    plotting.hide_extra_axes(axes=axes, num_subplots=len(features))

    plt.tight_layout()
    plt.suptitle(
        "Feature Distributions across Task Types and Hands",
        y=1.01,
        fontsize=10 + len(axes) // 2,
    )

    if output_path:
        plotting.save_figure(
            figure=fig, output_path=output_path, filename="feature_distributions"
        )
    else:
        logger.debug("Feature distributions plot generated but not saved")

    return fig


def plot_feature_trends(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot lineplots to compare feature values across conditions per participant.

    This function creates line plots displaying feature progression across task
    sequences with individual participant trajectories and group means. The input
    data should be a CSV file with the first 5 columns reserved for metadata
    (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
    columns treated as numerical features.

    Both standard graphomotor features and custom feature columns added by users
    are supported. For a complete list of the 25 standard features available from
    the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot. If None, all features are
            plotted. Can include any of the 25 standard graphomotor features (see
            module docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.
    """
    logger.debug("Starting feature trends plot generation")

    plot_data, features, tasks = plotting.prepare_feature_plot_data(data, features)
    logger.debug(f"Plotting trends across {len(tasks)} tasks")

    fig, axes = plotting.init_feature_subplots(len(features))
    for i, feature in enumerate(features):
        ax = axes[i]
        # Thin, faint lines: one trajectory per participant and hand.
        sns.lineplot(
            data=plot_data,
            x="task_order",
            y=feature,
            hue="hand",
            units="participant_id",
            estimator=None,
            alpha=0.2,
            linewidth=0.5,
            legend=False,
            ax=ax,
        )
        # Thick lines with markers: the group mean per hand.
        sns.lineplot(
            data=plot_data,
            x="task_order",
            y=feature,
            hue="hand",
            estimator="mean",
            errorbar=None,
            linewidth=2,
            marker="o",
            markersize=4,
            ax=ax,
        )
        display_name = plotting.format_feature_name(feature)
        ax.set_title(display_name)
        ax.set_ylabel(display_name)
        ax.set_xlabel("Task")
        ax.set_xticks(list(range(1, len(tasks) + 1)))
        ax.set_xticklabels(tasks, rotation=45, ha="right")
        ax.legend(title="Hand")
        ax.grid(alpha=0.3)

    plotting.hide_extra_axes(axes=axes, num_subplots=len(features))

    plt.tight_layout()
    plt.suptitle(
        "Feature Trends across Tasks and Hands", y=1.01, fontsize=10 + len(axes) // 2
    )

    if output_path:
        plotting.save_figure(
            figure=fig, output_path=output_path, filename="feature_trends"
        )
    else:
        logger.debug("Feature trends plot generated but not saved")

    return fig


def plot_feature_boxplots(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot boxplots to compare feature distributions across conditions.

    This function creates box-and-whisker plots comparing feature distributions
    across different tasks and hand conditions. The input data should be a CSV
    file with the first 5 columns reserved for metadata (`source_file`,
    `participant_id`, `task`, `hand`, `start_time`), with all subsequent columns treated
    as numerical features.

    Both standard graphomotor features and custom feature columns added by users
    are supported. For a complete list of the 25 standard features available from
    the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot. If None, all features are
            plotted. Can include any of the 25 standard graphomotor features (see
            module docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.
    """
    logger.debug("Starting feature boxplots generation")

    plot_data, features, tasks = plotting.prepare_feature_plot_data(data, features)
    logger.debug(f"Creating boxplots across {len(tasks)} tasks")

    fig, axes = plotting.init_feature_subplots(len(features))
    for i, feature in enumerate(features):
        ax = axes[i]

        # Suppress seaborn's internal deprecation warning about 'vert' parameter
        with warnings.catch_warnings():
            warnings.filterwarnings(
                "ignore",
                category=PendingDeprecationWarning,
                message="vert: bool will be deprecated.*",
            )
            sns.boxplot(
                data=plot_data,
                x="task",
                y=feature,
                hue="hand",
                order=tasks,
                ax=ax,
            )

        display_name = plotting.format_feature_name(feature)
        ax.set_title(display_name)
        ax.set_ylabel(display_name)
        ax.set_xlabel("Task")
        ax.set_xticks(list(range(len(tasks))))
        ax.set_xticklabels(tasks, rotation=45, ha="right")
        ax.legend(title="Hand")
        ax.grid(alpha=0.3)

    plotting.hide_extra_axes(axes=axes, num_subplots=len(features))

    plt.tight_layout()
    plt.suptitle(
        "Feature Boxplots across Tasks and Hands", y=1.01, fontsize=10 + len(axes) // 2
    )

    if output_path:
        plotting.save_figure(
            figure=fig, output_path=output_path, filename="feature_boxplots"
        )
    else:
        logger.debug("Feature boxplots generated but not saved")

    return fig


def plot_feature_clusters(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot clustered heatmap of standardized feature values across conditions.

    This function creates a hierarchically clustered heatmap that visualizes the median
    feature values across conditions. Each feature is z-score standardized across
    conditions to allow comparison when features are on different scales. Both features
    and conditions are hierarchically clustered to highlight groups of similar feature
    response patterns and conditions that elicit similar profiles.

    The input data should be a CSV file with the first 5 columns reserved for metadata
    (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
    columns treated as numerical features. Both standard graphomotor features and custom
    feature columns added by users are supported. For a complete list of the 25
    standard features available from the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot. If None, all features are
            plotted. Can include any of the 25 standard graphomotor features (see
            module docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.

    Raises:
        ValueError: If fewer than 2 features are provided.
    """
    logger.debug("Starting feature clusters heatmap generation")

    plot_data, features, _ = plotting.prepare_feature_plot_data(data, features)

    if len(features) < 2:
        error_msg = (
            f"At least 2 features required for clustered heatmap, got {len(features)}"
        )
        logger.error(error_msg)
        raise ValueError(error_msg)

    plot_data["condition"] = plot_data["task"] + "_" + plot_data["hand"]

    condition_medians = plot_data.groupby("condition")[features].median()

    # Transpose so that rows are features and columns are conditions.
    heatmap_data = condition_medians.T
    logger.debug(f"Heatmap data shape: {heatmap_data.shape} for (features, conditions)")

    width = max(10, len(heatmap_data.columns) * 0.8)
    height = max(6, len(heatmap_data.index) * 0.3)

    grid = sns.clustermap(
        heatmap_data,
        z_score=0,  # standardize each row (feature) across conditions
        figsize=(width, height),
        cbar_kws={
            "label": "z-score",
            "location": "bottom",
            "orientation": "horizontal",
        },
        cbar_pos=(0.025, 0.93, 0.1 + 0.001 * width, 0.02 + 0.001 * height),
        center=0,
        cmap="coolwarm",
        linewidths=0.1,
        linecolor="black",
    )

    grid.figure.suptitle(
        "Feature Clusters Across Conditions",
        fontsize=14,
        y=1.01,
    )
    grid.ax_heatmap.set_xlabel("Condition")
    grid.ax_heatmap.set_ylabel("Feature")
    grid.ax_heatmap.set_yticklabels(grid.ax_heatmap.get_yticklabels(), rotation=0)
    grid.ax_heatmap.set_xticklabels(
        grid.ax_heatmap.get_xticklabels(), rotation=45, ha="right"
    )

    if output_path:
        plotting.save_figure(
            figure=grid.figure, output_path=output_path, filename="feature_clusters"
        )
    else:
        logger.debug("Feature clusters heatmap generated but not saved")

    return grid.figure

logger = <Logger graphomotor (WARNING)>

def plot_feature_distributions(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:

Plot kernel density estimates for each feature grouped by task type and hand.

This function creates kernel density estimation plots showing feature distributions grouped by task type (trace/recall) and hand (Dom/NonDom). The input data should be a CSV file with the first 5 columns reserved for metadata (source_file, participant_id, task, hand, start_time), with all subsequent columns treated as numerical features.

Both standard graphomotor features and custom feature columns added by users are supported. For a complete list of the 25 standard features available from the graphomotor extraction pipeline, see the features module documentation.

Arguments:

data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (source_file, participant_id, task, hand, start_time) followed by numerical feature columns.
output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
features: List of specific features to plot. If None, all features are plotted. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.

Returns:

The matplotlib Figure.
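
To illustrate what one density curve represents, here is a minimal pure-Python sketch of a Gaussian kernel density estimate evaluated only between the smallest and largest observation, which mirrors the `cut=0` setting passed to `sns.kdeplot` in the source. The helper name `gaussian_kde_on_range`, the Scott's-rule bandwidth, and the sample values are assumptions of this sketch, not graphomotor's or seaborn's implementation.

```python
import math
import statistics


def gaussian_kde_on_range(values, num_points=5):
    """Evaluate a Gaussian KDE on a grid limited to [min(values), max(values)]."""
    n = len(values)
    # Scott's rule bandwidth; chosen here as an assumption of the sketch.
    bandwidth = statistics.stdev(values) * n ** (-1 / 5)
    lo, hi = min(values), max(values)
    # The grid stops at the data limits, so the curve is truncated there.
    step = (hi - lo) / (num_points - 1)
    grid = [lo + i * step for i in range(num_points)]
    # Normalization so that the untruncated density integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]
    return grid, density


# Hypothetical feature values for one (hand, task type) subset.
sample = [1.0, 2.0, 2.5, 3.0, 5.0]
grid, density = gaussian_kde_on_range(sample)
for x, d in zip(grid, density):
    print(f"x={x:.1f} density={d:.4f}")
```

In the plotting function, one such curve is drawn per hand and task type combination on each feature's axes.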

def plot_feature_trends(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:

Plot lineplots to compare feature values across conditions per participant.

This function creates line plots displaying feature progression across task sequences with individual participant trajectories and group means. The input data should be a CSV file with the first 5 columns reserved for metadata (source_file, participant_id, task, hand, start_time), with all subsequent columns treated as numerical features.

Arguments:

data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (source_file, participant_id, task, hand, start_time) followed by numerical feature columns.
output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
features: List of specific features to plot. If None, all features are plotted. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.

Returns:

The matplotlib Figure.
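
The two layers of a trend plot (individual trajectories and group means) can be sketched without any plotting library. The rows below are hypothetical; the column meanings follow the `participant_id`, `hand`, and `task_order` names used in the source.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical rows: (participant_id, hand, task_order, feature value).
rows = [
    ("P001", "Dom", 1, 10.0),
    ("P001", "Dom", 2, 12.0),
    ("P002", "Dom", 1, 14.0),
    ("P002", "Dom", 2, 18.0),
    ("P001", "NonDom", 1, 20.0),
    ("P002", "NonDom", 1, 24.0),
]

# Thin lines: one trajectory per (participant, hand), ordered by task.
trajectories = defaultdict(list)
for participant, hand, task_order, value in sorted(rows, key=lambda r: r[2]):
    trajectories[(participant, hand)].append((task_order, value))

# Thick lines: the mean over participants for each (hand, task_order).
grouped = defaultdict(list)
for _, hand, task_order, value in rows:
    grouped[(hand, task_order)].append(value)
group_means = {key: fmean(values) for key, values in grouped.items()}

print(dict(trajectories))
print(group_means)
```

Drawing both layers on the same axes shows how far individual participants deviate from the group mean of their hand condition.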

def plot_feature_boxplots(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:

Plot boxplots to compare feature distributions across conditions.

This function creates box-and-whisker plots comparing feature distributions across different tasks and hand conditions. The input data should be a CSV file with the first 5 columns reserved for metadata (source_file, participant_id, task, hand, start_time), with all subsequent columns treated as numerical features.

Arguments:

data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (source_file, participant_id, task, hand, start_time) followed by numerical feature columns.
output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
features: List of specific features to plot. If None, all features are plotted. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.

Returns:

The matplotlib Figure.
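
For readers unfamiliar with what a single box encodes, the following sketch computes the usual box-and-whisker statistics for one group of values. It assumes the common 1.5 × IQR whisker convention and linearly interpolated quartiles; the helper name `box_stats` and the sample data are hypothetical.

```python
import statistics


def box_stats(values, whisker=1.5):
    """Return quartiles, whisker ends, and outliers for one box."""
    data = sorted(values)
    # "inclusive" interpolates linearly between the observed data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical feature values for one task and hand.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```

Here the value 100 lies beyond the upper fence (7 + 1.5 × 4 = 13), so it would be drawn as an individual outlier point while the upper whisker stops at 8.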

def plot_feature_clusters(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot clustered heatmap of standardized feature values across conditions.

    This function creates a hierarchically clustered heatmap that visualizes the median
    feature values across conditions. Each feature is z-score standardized across
    conditions so that features on different scales can be compared. Both features
    and conditions are hierarchically clustered to highlight groups of features with
    similar response patterns and conditions that elicit similar profiles.

    The input data should be a CSV file with the first 5 columns reserved for metadata
    (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
    columns treated as numerical features. Both standard graphomotor features and custom
    feature columns added by users are supported. For a complete list of the 25
    standard features available from the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot. If None, all features are
            plotted. Can include any of the 25 standard graphomotor features (see
            module docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.

    Raises:
        ValueError: If fewer than 2 features are provided.
    """
    logger.debug("Starting feature clusters heatmap generation")

    plot_data, features, _ = plotting.prepare_feature_plot_data(data, features)

    if len(features) < 2:
        error_msg = (
            f"At least 2 features required for clustered heatmap, got {len(features)}"
        )
        logger.error(error_msg)
        raise ValueError(error_msg)

    # Combine task and hand into a single condition label, e.g. "<task>_<hand>".
    plot_data["condition"] = plot_data["task"] + "_" + plot_data["hand"]

    condition_medians = plot_data.groupby("condition")[features].median()

    # Transpose so that rows are features and columns are conditions.
    heatmap_data = condition_medians.T
    logger.debug(f"Heatmap data shape: {heatmap_data.shape} for (features, conditions)")

    # Scale the figure with the number of conditions (width) and features (height).
    width = max(10, len(heatmap_data.columns) * 0.8)
    height = max(6, len(heatmap_data.index) * 0.3)

    # z_score=0 standardizes each row, i.e. each feature across conditions.
    grid = sns.clustermap(
        heatmap_data,
        z_score=0,
        figsize=(width, height),
        cbar_kws={
            "label": "z-score",
            "location": "bottom",
            "orientation": "horizontal",
        },
        cbar_pos=(0.025, 0.93, 0.1 + 0.001 * width, 0.02 + 0.001 * height),
        center=0,
        cmap="coolwarm",
        linewidths=0.1,
        linecolor="black",
    )

    grid.figure.suptitle(
        "Feature Clusters Across Conditions",
        fontsize=14,
        y=1.01,
    )
    grid.ax_heatmap.set_xlabel("Condition")
    grid.ax_heatmap.set_ylabel("Feature")
    grid.ax_heatmap.set_yticklabels(grid.ax_heatmap.get_yticklabels(), rotation=0)
    grid.ax_heatmap.set_xticklabels(
        grid.ax_heatmap.get_xticklabels(), rotation=45, ha="right"
    )

    if output_path:
        plotting.save_figure(
            figure=grid.figure, output_path=output_path, filename="feature_clusters"
        )
    else:
        logger.debug("Feature clusters heatmap generated but not saved")

    return grid.figure

Plot clustered heatmap of standardized feature values across conditions.

This function creates a hierarchically clustered heatmap that visualizes the median feature values across conditions. Each feature is z-score standardized across conditions so that features on different scales can be compared. Both features and conditions are hierarchically clustered to highlight groups of features with similar response patterns and conditions that elicit similar profiles.

The input data should be a CSV file with the first 5 columns reserved for metadata (source_file, participant_id, task, hand, start_time), with all subsequent columns treated as numerical features. Both standard graphomotor features and custom feature columns added by users are supported. For a complete list of the 25 standard features available from the graphomotor extraction pipeline, see the features module documentation.

Arguments:

data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (source_file, participant_id, task, hand, start_time) followed by numerical feature columns.
output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
features: List of specific features to plot. If None, all features are plotted. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.

Returns:

The matplotlib Figure.

Raises:

ValueError: If fewer than 2 features are provided.
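
The data preparation behind the heatmap (condition labels, per-condition medians, and per-feature z-scores) can be reproduced on a small table without any plotting. The sketch below is an illustration only: the task labels, hand labels, feature names, and values are made up, and the standardization uses pandas' default sample standard deviation, which is assumed here to match what seaborn applies when `z_score=0` is passed to `clustermap`.

```python
import pandas as pd

# Hypothetical feature table; metadata columns other than task and hand are
# omitted for brevity.
plot_data = pd.DataFrame(
    {
        "task": ["trace"] * 4 + ["recall"] * 4,
        "hand": ["Dom", "Dom", "NonDom", "NonDom"] * 2,
        "duration": [10.0, 12.0, 14.0, 16.0, 20.0, 22.0, 24.0, 26.0],
        "mean_velocity": [5.0, 7.0, 4.0, 6.0, 3.0, 5.0, 2.0, 2.0],
    }
)
features = ["duration", "mean_velocity"]

# One label per task/hand combination, e.g. "trace_Dom".
plot_data["condition"] = plot_data["task"] + "_" + plot_data["hand"]

# Median of each feature within each condition, then transpose so that
# rows are features and columns are conditions.
heatmap_data = plot_data.groupby("condition")[features].median().T

# Row-wise z-score: each feature is centered and scaled across conditions.
row_mean = heatmap_data.mean(axis=1)
row_std = heatmap_data.std(axis=1)
z_scores = heatmap_data.sub(row_mean, axis=0).div(row_std, axis=0)
```

After this step every row of `z_scores` has mean 0 and unit standard deviation, so a feature measured in seconds and one measured in pixels per second contribute on the same color scale. A feature whose median is identical in every condition would have zero standard deviation and produce undefined values, which is worth checking before plotting.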