Decoding fMRI: LSA vs LSS
Shared on March 6, 2026
📊 MVPA Decoding & Cross‑Validation in fMRI
Key idea: Decoding neural data is a regression problem; classification is simply a thresholded regression.
Take‑home: Use beta‑series regression (LSA/LSS) → logistic regression → AUC → cross‑validation (leave‑one‑subject‑out).
Executive Summary
The lecture explains how to transform fMRI time‑series into single‑trial neural activity estimates (beta‑series), use those estimates to predict behavioral choices, and evaluate the predictive model with proper performance metrics (AUC) and robust cross‑validation. It also covers common pitfalls such as multicollinearity, over‑fitting, and the need for data centering.
Key Takeaways
- GLM flowchart: Stimulus → HRF convolution → GLM → β‑coefficients (encoding) vs β‑coefficients → neural activity → behavioral prediction (decoding).
- Beta‑series regression
- LSA (Least Squares All): one regressor per trial → fast, but suffers from collinearity.
- LSS (Least Squares Separate): one GLM per trial → slower, but more robust to collinearity.
- Linear Probability Model (LPM): OLS regression where Y ∈ {0,1}; predictions are probabilities.
- Logistic regression: transforms LPM output into [0,1] via the logistic function.
- Accuracy is misleading when class frequencies differ; AUC is a better, threshold‑free metric.
- Cross‑validation:
- Hold‑out (70/30 split) – simple but sensitive to split.
- k‑fold – repeated k‑fold gives a more stable estimate.
- Leave‑one‑subject‑out (LOSO) is the gold standard for group decoding.
- Data formatting: convert 3‑D time‑series (subjects × voxels × time) into a long format (rows = trials × voxels × subjects).
- Demeaning: subtract run‑wise means to reduce run‑level bias; avoid demeaning Y when class balance matters.
- Multicollinearity & over‑fitting: high‑dimensional voxel sets lead to extreme β values; reducing dimensionality or regularizing (e.g., ridge, LASSO) helps.
Detailed Summary
1. Encoding vs Decoding
- Encoding: predict neural response from known stimuli (GLM).
- Decoding: infer stimuli/behaviors from neural data (reverse GLM).
- Both use the same GLM machinery; decoding flips the direction of inference.
2. Single‑Trial Estimation (Beta‑Series)
- Goal: estimate β for each trial → “neural activity” per event.
- Method:
- Create a regressor matrix where each column is a single‑trial HRF.
- Fit GLM → β coefficients = trial‑wise activity.
- Approaches
- LSA: one GLM containing one regressor per trial (29 regressors in the lecture's example); fast.
- LSS: one GLM per trial, with all remaining trials collapsed into a single nuisance regressor; slow but more robust to collinearity.
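The two approaches can be sketched on synthetic data. This is a minimal illustration, not a toolbox implementation: a toy boxcar design stands in for HRF‑convolved regressors, and all variable names are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_trials = 200, 10

# Toy single-trial regressors: one boxcar per trial
# (a stand-in for real HRF-convolved event regressors).
X = np.zeros((n_scans, n_trials))
onsets = np.arange(10, 200, 19)[:n_trials]
for j, t in enumerate(onsets):
    X[t:t + 5, j] = 1.0

true_beta = rng.normal(1.0, 0.5, n_trials)
y = X @ true_beta + rng.normal(0, 0.1, n_scans)

# LSA: one GLM, one column per trial.
beta_lsa, *_ = np.linalg.lstsq(X, y, rcond=None)

# LSS: one GLM per trial -- trial j keeps its own column,
# all other trials are collapsed into one nuisance regressor.
beta_lss = np.empty(n_trials)
for j in range(n_trials):
    others = np.delete(X, j, axis=1).sum(axis=1, keepdims=True)
    Xj = np.column_stack([X[:, j], others])
    beta_lss[j] = np.linalg.lstsq(Xj, y, rcond=None)[0][0]
```

With well‑separated trials both methods recover the trial betas; the difference between LSA and LSS only becomes important when the single‑trial regressors overlap and become collinear.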
3. Regression Models for Binary Outcomes
- Linear Probability Model (LPM):
- Y = Xβ + ε, Y ∈ {0,1}.
- OLS coefficients are unbiased, but the binary outcome makes the errors heteroskedastic (so OLS is not strictly BLUE), and fitted values can fall outside [0,1].
- Logistic Regression:
- Apply logistic function σ(Xβ) → bounded probabilities.
- A generalized linear model with a logit link, fitted by maximum likelihood; unlike the LPM, its predictions always lie in (0, 1).
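The contrast can be sketched with plain numpy on synthetic data: OLS on a 0/1 outcome (the LPM) versus logistic regression fitted by a few damped Newton-Raphson steps. The data and all names are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.normal(0, 2, size=(100, 1))
X = np.column_stack([np.ones(100), x])   # intercept + one predictor
y = (rng.uniform(size=100) < sigmoid(2 * x[:, 0])).astype(float)

# LPM: plain OLS on the 0/1 outcome; fitted values can leave [0, 1].
beta_lpm, *_ = np.linalg.lstsq(X, y, rcond=None)
p_lpm = X @ beta_lpm

# Logistic regression via Newton-Raphson (IRLS) with a tiny
# Hessian ridge for numerical stability.
beta = np.zeros(2)
for _ in range(25):
    p = sigmoid(X @ beta)
    W = p * (1 - p)
    grad = X.T @ (y - p)
    H = X.T @ (X * W[:, None]) + 1e-3 * np.eye(2)
    beta += np.linalg.solve(H, grad)
p_logit = sigmoid(X @ beta)              # always inside (0, 1)
```

The logistic predictions are bounded by construction, which is exactly the property the LPM lacks.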
4. Performance Metrics
- Accuracy: fraction of correct predictions; inflated if class imbalance.
- AUC (Area Under ROC):
- Threshold‑free; 0.5 corresponds to chance performance, 1.0 to perfect separation.
- Computed by ranking predicted probabilities against true labels.
- Preferred for imbalanced data.
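The ranking computation can be made concrete. The sketch below computes AUC from scratch as the probability that a randomly chosen positive trial outranks a randomly chosen negative one (the normalized Mann-Whitney U statistic); `auc_from_ranks` is an illustrative name, not a library function.

```python
import numpy as np

def auc_from_ranks(y_true, y_score):
    """AUC as a rank statistic: P(score of random positive > score of random negative)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    # Count all positive-vs-negative pairwise comparisons; ties count as half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
result = auc_from_ranks(y, scores)   # 3 of 4 pairs ranked correctly -> 0.75
```

Because only the ranking of the scores matters, any monotone transform of the predicted probabilities (e.g., the logistic squashing above) leaves the AUC unchanged.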
5. Cross‑Validation Strategies
| Scheme | Description | Pros | Cons |
|---|---|---|---|
| Hold‑out | 70/30 split | Simple | Sensitive to split |
| Repeated hold‑out | Multiple random 70/30 splits | Reduces variance of the estimate | Test sets overlap across repeats |
| k‑fold | Split into k folds, train on k‑1, test on 1 | Balanced training/testing | Requires k model fits |
| Leave‑one‑subject‑out (LOSO) | Train on all but one subject, test on left‑out | Gold standard for group decoding | Computationally heavy |
- LOSO is recommended for decoding across subjects because it tests generalization to new participants.
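A minimal LOSO splitter, written from scratch so no particular ML library is assumed (`leave_one_subject_out` is an illustrative name; scikit-learn's `LeaveOneGroupOut` provides the same behavior):

```python
import numpy as np

def leave_one_subject_out(subjects):
    """Yield (held-out subject, train indices, test indices) per fold."""
    subjects = np.asarray(subjects)
    for s in np.unique(subjects):
        test = np.flatnonzero(subjects == s)    # all trials of one subject
        train = np.flatnonzero(subjects != s)   # everyone else
        yield s, train, test

# Toy case: 3 subjects, 2 trials each.
subjects = np.array([1, 1, 2, 2, 3, 3])
folds = list(leave_one_subject_out(subjects))
```

Each fold tests generalization to a participant the model has never seen, which is the generalization claim a group‑level decoding study usually wants to make.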
6. Data Preparation
- Long format: one row per trial × voxel × subject.
- Columns: subject, trial, voxel, beta, choice.
- Enables vectorized operations and straightforward cross‑validation indexing.
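One way to build the long format from a 3‑D beta array, sketched with plain numpy; the dimension order and array names here are assumptions of the sketch, not prescribed by the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sub, n_vox, n_trial = 3, 4, 5
betas = rng.normal(size=(n_sub, n_vox, n_trial))   # subjects x voxels x trials

# Index grids: one entry per (subject, voxel, trial) combination.
sub, vox, trial = np.meshgrid(np.arange(n_sub), np.arange(n_vox),
                              np.arange(n_trial), indexing="ij")

# Long format: one row per trial x voxel x subject,
# columns = subject, voxel, trial, beta.
long = np.column_stack([sub.ravel(), vox.ravel(),
                        trial.ravel(), betas.ravel()])
```

From here a choice column can be joined on the trial index, and fold membership for cross‑validation is a simple boolean mask on the subject column.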
7. Practical Workflow (Python / MATLAB)
- Load data: 3‑D array (time × voxels × subjects).
- Convert to long format.
- Compute beta‑series (LSA or LSS).
- Fit LPM on training set → obtain β.
- Predict probabilities on test set → logistic transform if desired.
- Compute AUC per left‑out subject.
- Average AUC across folds → final performance estimate.
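The steps above can be strung together in one compact sketch: synthetic trial‑wise betas, an LPM fit on the training subjects, an optional logistic squashing of the scores, and a per‑subject AUC averaged across folds. Helper names and the simulated data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def auc(y, s):
    # Rank-based AUC; assumes both classes are present in the fold.
    pos, neg = s[y == 1], s[y == 0]
    return (pos[:, None] > neg[None, :]).mean()

rng = np.random.default_rng(3)
n_sub, n_trials, n_vox = 5, 40, 6
subjects = np.repeat(np.arange(n_sub), n_trials)
X = rng.normal(size=(n_sub * n_trials, n_vox))      # trial-wise betas
w = rng.normal(size=n_vox)                          # true decoding weights
y = (rng.uniform(size=len(X)) < sigmoid(X @ w)).astype(float)

aucs = []
for s in np.unique(subjects):
    tr, te = subjects != s, subjects == s
    # LPM on the training subjects: OLS on the 0/1 outcome.
    beta, *_ = np.linalg.lstsq(X[tr], y[tr], rcond=None)
    p = sigmoid(X[te] @ beta)     # optional logistic squashing of LPM scores
    aucs.append(auc(y[te], p))
mean_auc = float(np.mean(aucs))   # final performance estimate
```

Because AUC is rank‑based, the sigmoid step changes the scores but not the reported performance; it only matters if calibrated probabilities are needed.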
8. Common Pitfalls & Remedies
| Issue | Symptom | Fix |
|---|---|---|
| Multicollinearity | Extreme β magnitudes, unstable predictions | Reduce dimensionality, use ridge/LASSO, or LSS |
| Over‑fitting | High training accuracy, low test AUC | Cross‑validate, regularize, limit voxel count |
| Class imbalance | Accuracy ≈ majority class | Use AUC, re‑sample, or class‑weighting |
| Run‑level bias | Systematic shifts across runs | Demean X per run; keep Y raw for binary outcome |
| Non‑independent observations | Inflated test statistics | Use permutation tests or mixed‑effects models |
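The ridge remedy for multicollinearity can be shown in closed form (numpy‑only sketch; `ridge` is an illustrative helper, not a library function):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge: (X'X + lam*I)^(-1) X'y shrinks betas toward zero."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(4)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0, 0.01, n)       # nearly collinear "voxel"
X = np.column_stack([x1, x2])
y = x1 + rng.normal(0, 0.1, n)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # unstable under collinearity
beta_ridge = ridge(X, y, lam=1.0)                  # shrunken, stable
```

The ridge solution always has a smaller norm than the OLS solution, and with near‑duplicate predictors it spreads the shared signal across them instead of letting the coefficients blow up in opposite directions. LASSO behaves similarly but drives some coefficients exactly to zero.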
9. Take‑away Messages
- Decoding is regression: treat binary outcomes as probabilities.
- AUC beats accuracy when classes are imbalanced.
- Cross‑validation matters: choose a scheme that reflects the generalization you care about.
- Data formatting is critical: long format simplifies modeling and validation.
- Regularization is essential when using many voxels to avoid multicollinearity and over‑fitting.
“The only way to know if a pattern is real is to see it in other data sets.” – Emphasis on out‑of‑sample validation.