graphomotor.plot.feature_plots
Feature visualization functions for Graphomotor.
This module provides plotting functions for visualizing extracted features from spiral
drawing data. The plotting functions expect CSV files with the first 5 columns reserved
for metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`), and treat
all subsequent columns as numerical features.
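The column convention described above can be sketched with the standard library alone. This is a minimal illustration, not the module's own loading code: the CSV content, the task names, and the feature names `duration` and `my_custom_feature` are made up for the example.

```python
import csv
import io

# Number of leading metadata columns, as stated in the module description.
N_METADATA = 5

# Hypothetical CSV content; the task names and the two feature columns are
# invented for illustration only.
csv_text = """source_file,participant_id,task,hand,start_time,duration,my_custom_feature
a.csv,P001,spiral_trace1,Dom,2024-01-01T10:00:00,12.5,0.31
b.csv,P001,spiral_recall1,NonDom,2024-01-01T10:05:00,14.2,0.27
"""

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)

# The first five columns are metadata; everything after is a feature.
metadata_columns = header[:N_METADATA]
feature_columns = header[N_METADATA:]

# Parse each row, converting feature values to floats.
rows = []
for record in reader:
    metadata = dict(zip(metadata_columns, record[:N_METADATA]))
    features = {
        name: float(value)
        for name, value in zip(feature_columns, record[N_METADATA:])
    }
    rows.append((metadata, features))

print(metadata_columns)
print(feature_columns)
```

Because the split is purely positional, a custom column appended after the metadata columns is picked up as a feature without any further configuration.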
Available Features
The graphomotor toolkit extracts 25 features from spiral drawing data. For a complete list of all available features, see the features module documentation.
Custom Features
Users can add custom feature columns to their CSV files alongside the standard graphomotor features. Any additional columns after the first 5 metadata columns will be automatically detected and available for plotting.
Plot Types
- Distribution plots: Kernel density estimation plots showing feature distributions grouped by task type and hand.
- Trend plots: Line plots displaying feature progression across task sequences.
- Box plots: Box-and-whisker plots comparing distributions across conditions.
- Cluster heatmaps: Hierarchically clustered heatmaps of standardized features.
import pathlib
import warnings

import matplotlib
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

from graphomotor.core import config
from graphomotor.utils import plotting

matplotlib.use("agg")  # prevent interactive matplotlib
logger = config.get_logger()
def plot_feature_distributions(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot kernel density estimates for each feature grouped by task type and hand.

    This function creates kernel density estimation plots showing feature distributions
    grouped by task type (trace/recall) and hand (Dom/NonDom). The input data should
    be a CSV file with the first 5 columns reserved for metadata (`source_file`,
    `participant_id`, `task`, `hand`, `start_time`), with all subsequent columns treated
    as numerical features.

    Both standard graphomotor features and custom feature columns added by users
    are supported. For a complete list of the 25 standard features available from
    the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot, if None plots all features.
            Can include any of the 25 standard graphomotor features (see module
            docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.
    """
    logger.debug("Starting feature distributions plot generation")

    plot_data, features, _ = plotting.prepare_feature_plot_data(data, features)

    hands = plot_data["hand"].unique()
    task_types = plot_data["task_type"].unique()

    # One color per (hand, task type) combination.
    colors = {
        (hand, task_type): plt.get_cmap("tab20")(i)
        for i, (hand, task_type) in enumerate(
            [(h, t) for h in hands for t in task_types]
        )
    }

    fig, axes = plotting.init_feature_subplots(len(features))
    for i, feature in enumerate(features):
        ax = axes[i]

        for hand in hands:
            for task_type in task_types:
                subset = plot_data[
                    (plot_data["hand"] == hand) & (plot_data["task_type"] == task_type)
                ]
                sns.kdeplot(
                    data=subset,
                    x=feature,
                    fill=True,
                    cut=0,
                    alpha=0.6,
                    color=colors[(hand, task_type)],
                    label=f"{hand} - {task_type.capitalize()}",
                    ax=ax,
                )

        display_name = plotting.format_feature_name(feature)
        ax.set_title(display_name)
        ax.set_xlabel(display_name)
        ax.set_ylabel("Density")
        ax.legend(title="Hand - Task Type")
        ax.grid(alpha=0.3)

    plotting.hide_extra_axes(axes=axes, num_subplots=len(features))

    plt.tight_layout()
    plt.suptitle(
        "Feature Distributions across Task Types and Hands",
        y=1.01,
        fontsize=10 + len(axes) // 2,
    )

    if output_path:
        plotting.save_figure(
            figure=fig, output_path=output_path, filename="feature_distributions"
        )
    else:
        logger.debug("Feature distributions plot generated but not saved")

    return fig
Plot kernel density estimates for each feature grouped by task type and hand.
This function creates kernel density estimation plots showing feature distributions
grouped by task type (trace/recall) and hand (Dom/NonDom). The input data should
be a CSV file with the first 5 columns reserved for metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent columns treated
as numerical features.
Both standard graphomotor features and custom feature columns added by users are supported. For a complete list of the 25 standard features available from the graphomotor extraction pipeline, see the features module documentation.
Arguments:
- data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`) followed by numerical feature columns.
- output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
- features: List of specific features to plot, if None plots all features. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.
Returns:
The matplotlib Figure.
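To make the grouping and density estimation concrete, here is a small standard-library sketch of a Gaussian kernel density estimate computed per (hand, task type) group and evaluated only between the smallest and largest observation, which is what `cut=0` requests in the plotting call. The records are invented, and the bandwidth (Scott's rule with the sample standard deviation) is an assumption about the default rather than something stated in this documentation.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical records: (hand, task_type, feature value).
records = [
    ("Dom", "trace", 1.0), ("Dom", "trace", 1.4), ("Dom", "trace", 2.1),
    ("Dom", "recall", 2.0), ("Dom", "recall", 2.6), ("Dom", "recall", 3.3),
    ("NonDom", "trace", 1.8), ("NonDom", "trace", 2.2), ("NonDom", "trace", 2.9),
]

# Group the feature values by (hand, task_type), one curve per group.
groups = defaultdict(list)
for hand, task_type, value in records:
    groups[(hand, task_type)].append(value)


def gaussian_kde(values, grid_size=50):
    """Evaluate a Gaussian KDE on a grid spanning only the observed range."""
    n = len(values)
    # Scott's rule for one dimension (assumed default bandwidth).
    bandwidth = statistics.stdev(values) * n ** (-1 / 5)
    low, high = min(values), max(values)
    step = (high - low) / (grid_size - 1)
    grid = [low + i * step for i in range(grid_size)]
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))
    density = [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]
    return grid, density


curves = {key: gaussian_kde(vals) for key, vals in groups.items()}
for (hand, task_type), (grid, density) in curves.items():
    print(f"{hand} - {task_type.capitalize()}: peak density {max(density):.3f}")
```

Restricting the grid to the observed range keeps the curve from extending past the data, at the cost of the plotted area no longer integrating to exactly one.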
def plot_feature_trends(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot lineplots to compare feature values across conditions per participant.

    This function creates line plots displaying feature progression across task
    sequences with individual participant trajectories and group means. The input
    data should be a CSV file with the first 5 columns reserved for metadata
    (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
    columns treated as numerical features.

    Both standard graphomotor features and custom feature columns added by users
    are supported. For a complete list of the 25 standard features available from
    the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot, if None plots all features.
            Can include any of the 25 standard graphomotor features (see module
            docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.
    """
    logger.debug("Starting feature trends plot generation")

    plot_data, features, tasks = plotting.prepare_feature_plot_data(data, features)
    logger.debug(f"Plotting trends across {len(tasks)} tasks")

    fig, axes = plotting.init_feature_subplots(len(features))
    for i, feature in enumerate(features):
        ax = axes[i]
        # Thin, faint lines: one trajectory per participant.
        sns.lineplot(
            data=plot_data,
            x="task_order",
            y=feature,
            hue="hand",
            units="participant_id",
            estimator=None,
            alpha=0.2,
            linewidth=0.5,
            legend=False,
            ax=ax,
        )
        # Thick lines with markers: group mean per hand.
        sns.lineplot(
            data=plot_data,
            x="task_order",
            y=feature,
            hue="hand",
            estimator="mean",
            errorbar=None,
            linewidth=2,
            marker="o",
            markersize=4,
            ax=ax,
        )
        display_name = plotting.format_feature_name(feature)
        ax.set_title(display_name)
        ax.set_ylabel(display_name)
        ax.set_xlabel("Task")
        ax.set_xticks(list(range(1, len(tasks) + 1)))
        ax.set_xticklabels(tasks, rotation=45, ha="right")
        ax.legend(title="Hand")
        ax.grid(alpha=0.3)

    plotting.hide_extra_axes(axes=axes, num_subplots=len(features))

    plt.tight_layout()
    plt.suptitle(
        "Feature Trends across Tasks and Hands", y=1.01, fontsize=10 + len(axes) // 2
    )

    if output_path:
        plotting.save_figure(
            figure=fig, output_path=output_path, filename="feature_trends"
        )
    else:
        logger.debug("Feature trends plot generated but not saved")

    return fig
Plot lineplots to compare feature values across conditions per participant.
This function creates line plots displaying feature progression across task
sequences with individual participant trajectories and group means. The input
data should be a CSV file with the first 5 columns reserved for metadata
(`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
columns treated as numerical features.
Both standard graphomotor features and custom feature columns added by users are supported. For a complete list of the 25 standard features available from the graphomotor extraction pipeline, see the features module documentation.
Arguments:
- data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`) followed by numerical feature columns.
- output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
- features: List of specific features to plot, if None plots all features. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.
Returns:
The matplotlib Figure.
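The two layers of the trend plot, faint per-participant trajectories and a bold group mean per hand, correspond to two different groupings of the same long-format data. The following standard-library sketch shows that computation on invented records; the tuple layout is hypothetical and only mirrors the `participant_id`, `hand`, and `task_order` columns used in the plotting call.

```python
import statistics
from collections import defaultdict

# Hypothetical long-format records: (participant_id, hand, task_order, value).
records = [
    ("P1", "Dom", 1, 10.0), ("P1", "Dom", 2, 12.0), ("P1", "Dom", 3, 11.0),
    ("P2", "Dom", 1, 14.0), ("P2", "Dom", 2, 16.0), ("P2", "Dom", 3, 15.0),
    ("P1", "NonDom", 1, 20.0), ("P1", "NonDom", 2, 18.0), ("P1", "NonDom", 3, 19.0),
]

# Layer 1: one trajectory per (participant, hand), ordered by task.
trajectories = defaultdict(list)
# Layer 2: all values observed for each (hand, task_order), to be averaged.
by_hand_and_task = defaultdict(list)

for participant, hand, task_order, value in records:
    trajectories[(participant, hand)].append((task_order, value))
    by_hand_and_task[(hand, task_order)].append(value)

for points in trajectories.values():
    points.sort()

group_means = {
    key: statistics.mean(values) for key, values in by_hand_and_task.items()
}

for (hand, task_order), mean in sorted(group_means.items()):
    print(f"{hand}, task {task_order}: mean = {mean}")
```

Keeping the individual trajectories alongside the means makes it easier to see whether a group trend reflects most participants or only a few.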
def plot_feature_boxplots(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot boxplots to compare feature distributions across conditions.

    This function creates box-and-whisker plots comparing feature distributions
    across different tasks and hand conditions. The input data should be a CSV
    file with the first 5 columns reserved for metadata (`source_file`,
    `participant_id`, `task`, `hand`, `start_time`), with all subsequent columns treated
    as numerical features.

    Both standard graphomotor features and custom feature columns added by users
    are supported. For a complete list of the 25 standard features available from
    the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot, if None plots all features.
            Can include any of the 25 standard graphomotor features (see module
            docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.
    """
    logger.debug("Starting feature boxplots generation")

    plot_data, features, tasks = plotting.prepare_feature_plot_data(data, features)
    logger.debug(f"Creating boxplots across {len(tasks)} tasks")

    fig, axes = plotting.init_feature_subplots(len(features))
    for i, feature in enumerate(features):
        ax = axes[i]

        # Suppress seaborn's internal deprecation warning about 'vert' parameter
        with warnings.catch_warnings():
            warnings.filterwarnings(
                "ignore",
                category=PendingDeprecationWarning,
                message="vert: bool will be deprecated.*",
            )
            sns.boxplot(
                data=plot_data,
                x="task",
                y=feature,
                hue="hand",
                order=tasks,
                ax=ax,
            )

        display_name = plotting.format_feature_name(feature)
        ax.set_title(display_name)
        ax.set_ylabel(display_name)
        ax.set_xlabel("Task")
        ax.set_xticks(list(range(len(tasks))))
        ax.set_xticklabels(tasks, rotation=45, ha="right")
        ax.legend(title="Hand")
        ax.grid(alpha=0.3)

    plotting.hide_extra_axes(axes=axes, num_subplots=len(features))

    plt.tight_layout()
    plt.suptitle(
        "Feature Boxplots across Tasks and Hands", y=1.01, fontsize=10 + len(axes) // 2
    )

    if output_path:
        plotting.save_figure(
            figure=fig, output_path=output_path, filename="feature_boxplots"
        )
    else:
        logger.debug("Feature boxplots generated but not saved")

    return fig
Plot boxplots to compare feature distributions across conditions.
This function creates box-and-whisker plots comparing feature distributions
across different tasks and hand conditions. The input data should be a CSV
file with the first 5 columns reserved for metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent columns treated
as numerical features.
Both standard graphomotor features and custom feature columns added by users are supported. For a complete list of the 25 standard features available from the graphomotor extraction pipeline, see the features module documentation.
Arguments:
- data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`) followed by numerical feature columns.
- output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
- features: List of specific features to plot, if None plots all features. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.
Returns:
The matplotlib Figure.
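For readers who want to know what each box encodes, the sketch below computes the usual box-and-whisker summary for one group of values with the standard library. It assumes the common convention of whiskers reaching the most extreme observations within 1.5 times the interquartile range and linearly interpolated quartiles; `box_stats` is a hypothetical helper written for this illustration, not part of graphomotor or seaborn.

```python
import statistics


def box_stats(values, whisker=1.5):
    """Return quartiles, whisker ends, and outliers for one group of values."""
    data = sorted(values)
    # "inclusive" interpolates linearly between observed data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Invented feature values for a single task/hand combination.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```

In this invented sample the value 100 lies beyond the upper fence, so it would be drawn as an individual outlier point rather than stretching the whisker.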
def plot_feature_clusters(
    data: str | pathlib.Path | pd.DataFrame,
    output_path: str | pathlib.Path | None = None,
    features: list[str] | None = None,
) -> plt.Figure:
    """Plot clustered heatmap of standardized feature values across conditions.

    This function creates a hierarchically clustered heatmap that visualizes the median
    feature values across conditions. Each feature is z-score standardized across
    conditions so that features on different scales can be compared. Both features and
    conditions are hierarchically clustered to highlight groups of similar feature
    response patterns and conditions that elicit similar profiles.

    The input data should be a CSV file with the first 5 columns reserved for metadata
    (`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
    columns treated as numerical features. Both standard graphomotor features and custom
    feature columns added by users are supported. For a complete list of the 25
    standard features available from the graphomotor extraction pipeline, see the
    [features module documentation](https://childmindresearch.github.io/graphomotor/graphomotor/features.html).

    Args:
        data: Path to CSV file containing features or pandas DataFrame. Input data
            should have the first 5 columns as metadata (`source_file`,
            `participant_id`, `task`, `hand`, `start_time`) followed by numerical
            feature columns.
        output_path: Optional directory where the figure will be saved. If None,
            the function only returns the figure without saving.
        features: List of specific features to plot, if None plots all features.
            Can include any of the 25 standard graphomotor features (see module
            docstring) or custom feature columns added to the CSV file.

    Returns:
        The matplotlib Figure.

    Raises:
        ValueError: If fewer than 2 features are provided.
    """
    logger.debug("Starting feature clusters heatmap generation")

    plot_data, features, _ = plotting.prepare_feature_plot_data(data, features)

    if len(features) < 2:
        error_msg = (
            f"At least 2 features required for clustered heatmap, got {len(features)}"
        )
        logger.error(error_msg)
        raise ValueError(error_msg)

    plot_data["condition"] = plot_data["task"] + "_" + plot_data["hand"]

    condition_medians = plot_data.groupby("condition")[features].median()

    # Transpose so that rows are features and columns are conditions.
    heatmap_data = condition_medians.T
    logger.debug(f"Heatmap data shape: {heatmap_data.shape} for (features, conditions)")

    width = max(10, len(heatmap_data.columns) * 0.8)
    height = max(6, len(heatmap_data.index) * 0.3)

    grid = sns.clustermap(
        heatmap_data,
        z_score=0,  # standardize each row (feature) across conditions
        figsize=(width, height),
        cbar_kws={
            "label": "z-score",
            "location": "bottom",
            "orientation": "horizontal",
        },
        cbar_pos=(0.025, 0.93, 0.1 + 0.001 * width, 0.02 + 0.001 * height),
        center=0,
        cmap="coolwarm",
        linewidths=0.1,
        linecolor="black",
    )

    grid.figure.suptitle(
        "Feature Clusters Across Conditions",
        fontsize=14,
        y=1.01,
    )
    grid.ax_heatmap.set_xlabel("Condition")
    grid.ax_heatmap.set_ylabel("Feature")
    grid.ax_heatmap.set_yticklabels(grid.ax_heatmap.get_yticklabels(), rotation=0)
    grid.ax_heatmap.set_xticklabels(
        grid.ax_heatmap.get_xticklabels(), rotation=45, ha="right"
    )

    if output_path:
        plotting.save_figure(
            figure=grid.figure, output_path=output_path, filename="feature_clusters"
        )
    else:
        logger.debug("Feature clusters heatmap generated but not saved")

    return grid.figure
Plot clustered heatmap of standardized feature values across conditions.
This function creates a hierarchically clustered heatmap that visualizes the median feature values across conditions. Each feature is z-score standardized across conditions so that features on different scales can be compared. Both features and conditions are hierarchically clustered to highlight groups of features with similar response patterns and conditions that elicit similar profiles.
The input data should be a CSV file with the first 5 columns reserved for metadata
(`source_file`, `participant_id`, `task`, `hand`, `start_time`), with all subsequent
columns treated as numerical features. Both standard graphomotor features and custom
feature columns added by users are supported. For a complete list of the 25
standard features available from the graphomotor extraction pipeline, see the
features module documentation.
Arguments:
- data: Path to CSV file containing features or pandas DataFrame. Input data should have the first 5 columns as metadata (`source_file`, `participant_id`, `task`, `hand`, `start_time`) followed by numerical feature columns.
- output_path: Optional directory where the figure will be saved. If None, the function only returns the figure without saving.
- features: List of specific features to plot, if None plots all features. Can include any of the 25 standard graphomotor features (see module docstring) or custom feature columns added to the CSV file.
Returns:
The matplotlib Figure.
Raises:
- ValueError: If fewer than 2 features are provided.
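The data preparation behind the heatmap (a condition label built from task and hand, a median per condition, then standardization of each feature across conditions) can be sketched without pandas. The records and the feature names `duration` and `speed` are invented, and the use of the sample standard deviation in the z-score is an assumption about the plotting library's convention.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (task, hand, {feature name: value}).
records = [
    ("trace1", "Dom", {"duration": 10.0, "speed": 1.0}),
    ("trace1", "Dom", {"duration": 12.0, "speed": 3.0}),
    ("trace1", "NonDom", {"duration": 20.0, "speed": 2.0}),
    ("trace1", "NonDom", {"duration": 22.0, "speed": 2.0}),
    ("recall1", "Dom", {"duration": 30.0, "speed": 5.0}),
    ("recall1", "Dom", {"duration": 34.0, "speed": 7.0}),
]
feature_names = ["duration", "speed"]

# Collect values per condition, where a condition is "<task>_<hand>".
by_condition = defaultdict(lambda: defaultdict(list))
for task, hand, features in records:
    condition = f"{task}_{hand}"
    for name in feature_names:
        by_condition[condition][name].append(features[name])

conditions = sorted(by_condition)

# Rows are features, columns are conditions (median per condition).
medians = {
    name: [statistics.median(by_condition[c][name]) for c in conditions]
    for name in feature_names
}


def z_scores(row):
    """Standardize one feature row across conditions (sample std assumed)."""
    mean = statistics.mean(row)
    std = statistics.stdev(row)
    return [(value - mean) / std for value in row]


heatmap = {name: z_scores(row) for name, row in medians.items()}

for name in feature_names:
    formatted = ", ".join(f"{z:+.2f}" for z in heatmap[name])
    print(f"{name}: {formatted}")
```

Standardizing per feature means the colors show how a condition compares with the other conditions for that feature, so rows can be compared by pattern but not by raw magnitude.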