Decoding fMRI: LSA vs LSS
Shared on March 6, 2026
📊 MVPA Decoding & Cross‑Validation in fMRI
Key idea: Decoding neural data is a regression problem; classification is simply a thresholded regression.
Take‑home: Use beta‑series regression (LSA/LSS) → logistic regression → AUC → cross‑validation (leave‑one‑subject‑out).
Executive Summary
The lecture explains how to transform fMRI time‑series into single‑trial neural activity estimates (beta‑series), use those estimates to predict behavioral choices, and evaluate the predictive model with proper performance metrics (AUC) and robust cross‑validation. It also covers common pitfalls such as multicollinearity, over‑fitting, and the need for data centering.
Key Takeaways
- GLM flowchart: Stimulus → HRF convolution → GLM → β‑coefficients (encoding) vs β‑coefficients → neural activity → behavioral prediction (decoding).
- Beta‑series regression
- LSA (Least Squares All): one regressor per trial → fast, but suffers from collinearity.
- LSS (Least Squares Separate): one GLM per trial → slower, but more robust to collinearity.
- Linear Probability Model (LPM): OLS regression where Y ∈ {0,1}; predictions are probabilities.
- Logistic regression: transforms LPM output into [0,1] via the logistic function.
- Accuracy is misleading when class frequencies differ; AUC is a better, threshold‑free metric.
- Cross‑validation:
- Hold‑out (70/30 split) – simple but sensitive to split.
- k‑fold – repeated k‑fold gives a more stable estimate.
- Leave‑one‑subject‑out (LOSO) is the gold standard for group decoding.
- Data formatting: convert 3‑D time‑series (subjects × voxels × time) into a long format (rows = trials × voxels × subjects).
- Demeaning: subtract run‑wise means to reduce run‑level bias; avoid demeaning Y when class balance matters.
- Multicollinearity & over‑fitting: high‑dimensional voxel sets lead to extreme β values; reducing dimensionality or regularizing (e.g., ridge, LASSO) helps.
Detailed Summary
1. Encoding vs Decoding
- Encoding: predict neural response from known stimuli (GLM).
- Decoding: infer stimuli/behaviors from neural data (reverse GLM).
- Both use the same GLM machinery; decoding flips the direction of inference.
2. Single‑Trial Estimation (Beta‑Series)
- Goal: estimate β for each trial → “neural activity” per event.
- Method:
- Create a regressor matrix where each column is a single‑trial HRF.
- Fit GLM → β coefficients = trial‑wise activity.
- Approaches
- LSA: one GLM containing one regressor per trial (29 regressors in the lecture's example); fast.
- LSS: one GLM per trial, with all remaining trials collapsed into a single nuisance regressor; slow but more robust to collinearity.
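The two approaches can be sketched on synthetic data. This is a minimal illustration, not a toolbox implementation: a toy boxcar design stands in for HRF‑convolved regressors, and all variable names are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_scans, n_trials = 200, 10

# Toy single-trial regressors: one boxcar per trial
# (a stand-in for real HRF-convolved event regressors).
X = np.zeros((n_scans, n_trials))
onsets = np.arange(10, 200, 19)[:n_trials]
for j, t in enumerate(onsets):
    X[t:t + 5, j] = 1.0

true_beta = rng.normal(1.0, 0.5, n_trials)
y = X @ true_beta + rng.normal(0, 0.1, n_scans)

# LSA: one GLM, one column per trial.
beta_lsa, *_ = np.linalg.lstsq(X, y, rcond=None)

# LSS: one GLM per trial -- trial j keeps its own column,
# all other trials are collapsed into one nuisance regressor.
beta_lss = np.empty(n_trials)
for j in range(n_trials):
    others = np.delete(X, j, axis=1).sum(axis=1, keepdims=True)
    Xj = np.column_stack([X[:, j], others])
    beta_lss[j] = np.linalg.lstsq(Xj, y, rcond=None)[0][0]
```

With well‑separated trials both methods recover the trial betas; the difference between LSA and LSS only becomes important when the single‑trial regressors overlap and become collinear.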
3. Regression Models for Binary Outcomes
- Linear Probability Model (LPM):
- Y = Xβ + ε, Y ∈ {0,1}.
- OLS coefficients are unbiased, but the binary outcome makes the errors heteroskedastic (so OLS is not strictly BLUE), and fitted values can fall outside [0,1].
- Logistic Regression:
- Apply logistic function σ(Xβ) → bounded probabilities.
- A generalized linear model with a logit link, fitted by maximum likelihood; unlike the LPM, its predictions always lie in (0, 1).
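The contrast can be sketched with plain numpy on synthetic data: OLS on a 0/1 outcome (the LPM) versus logistic regression fitted by a few damped Newton-Raphson steps. The data and all names are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Logistic function: maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
x = rng.normal(0, 2, size=(100, 1))
X = np.column_stack([np.ones(100), x])   # intercept + one predictor
y = (rng.uniform(size=100) < sigmoid(2 * x[:, 0])).astype(float)

# LPM: plain OLS on the 0/1 outcome; fitted values can leave [0, 1].
beta_lpm, *_ = np.linalg.lstsq(X, y, rcond=None)
p_lpm = X @ beta_lpm

# Logistic regression via Newton-Raphson (IRLS) with a tiny
# Hessian ridge for numerical stability.
beta = np.zeros(2)
for _ in range(25):
    p = sigmoid(X @ beta)
    W = p * (1 - p)
    grad = X.T @ (y - p)
    H = X.T @ (X * W[:, None]) + 1e-3 * np.eye(2)
    beta += np.linalg.solve(H, grad)
p_logit = sigmoid(X @ beta)              # always inside (0, 1)
```

The logistic predictions are bounded by construction, which is exactly the property the LPM lacks.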
4. Performance Metrics
- Accuracy: fraction of correct predictions; inflated if class imbalance.
- AUC (Area Under ROC):
- Threshold‑free; 0.5 corresponds to chance performance, 1.0 to perfect separation.
- Computed by ranking predicted probabilities against true labels.
- Preferred for imbalanced data.
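The ranking computation can be made concrete. The sketch below computes AUC from scratch as the probability that a randomly chosen positive trial outranks a randomly chosen negative one (the normalized Mann-Whitney U statistic); `auc_from_ranks` is an illustrative name, not a library function.

```python
import numpy as np

def auc_from_ranks(y_true, y_score):
    """AUC as a rank statistic: P(score of random positive > score of random negative)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    # Count all positive-vs-negative pairwise comparisons; ties count as half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

y = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
result = auc_from_ranks(y, scores)   # 3 of 4 pairs ranked correctly -> 0.75
```

Because only the ranking of the scores matters, any monotone transform of the predicted probabilities (e.g., the logistic squashing above) leaves the AUC unchanged.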
5. Cross‑Validation Strategies
| Scheme | Description | Pros | Cons |
|---|---|---|---|
| Hold‑out | 70/30 split | Simple | Sensitive to split |
| Repeated hold‑out | Multiple random 70/30 splits | Reduces variance of the estimate | Test sets overlap across repeats |
| k‑fold | Split into k folds, train on k‑1, test on 1 | Balanced training/testing | Requires k model fits |
| Leave‑one‑subject‑out (LOSO) | Train on all but one subject, test on left‑out | Gold standard for group decoding | Computationally heavy |
- LOSO is recommended for decoding across subjects because it tests generalization to new participants.
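A minimal LOSO splitter, written from scratch so no particular ML library is assumed (`leave_one_subject_out` is an illustrative name; scikit-learn's `LeaveOneGroupOut` provides the same behavior):

```python
import numpy as np

def leave_one_subject_out(subjects):
    """Yield (held-out subject, train indices, test indices) per fold."""
    subjects = np.asarray(subjects)
    for s in np.unique(subjects):
        test = np.flatnonzero(subjects == s)    # all trials of one subject
        train = np.flatnonzero(subjects != s)   # everyone else
        yield s, train, test

# Toy case: 3 subjects, 2 trials each.
subjects = np.array([1, 1, 2, 2, 3, 3])
folds = list(leave_one_subject_out(subjects))
```

Each fold tests generalization to a participant the model has never seen, which is the generalization claim a group‑level decoding study usually wants to make.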
6. Data Preparation
- Long format: one row per trial × voxel × subject.
- Columns: subject, trial, voxel, beta, choice.
- Enables vectorized operations and straightforward cross‑validation indexing.
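One way to build the long format from a 3‑D beta array, sketched with plain numpy; the dimension order and array names here are assumptions of the sketch, not prescribed by the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sub, n_vox, n_trial = 3, 4, 5
betas = rng.normal(size=(n_sub, n_vox, n_trial))   # subjects x voxels x trials

# Index grids: one entry per (subject, voxel, trial) combination.
sub, vox, trial = np.meshgrid(np.arange(n_sub), np.arange(n_vox),
                              np.arange(n_trial), indexing="ij")

# Long format: one row per trial x voxel x subject,
# columns = subject, voxel, trial, beta.
long = np.column_stack([sub.ravel(), vox.ravel(),
                        trial.ravel(), betas.ravel()])
```

From here a choice column can be joined on the trial index, and fold membership for cross‑validation is a simple boolean mask on the subject column.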
7. Practical Workflow (Python / MATLAB)
- Load data: 3‑D array (time × voxels × subjects).
- Convert to long format.
- Compute beta‑series (LSA or LSS).
- Fit LPM on training set → obtain β.
- Predict probabilities on test set → logistic transform if desired.
- Compute AUC per left‑out subject.
- Average AUC across folds → final performance estimate.
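The steps above can be strung together in one compact sketch: synthetic trial‑wise betas, an LPM fit on the training subjects, an optional logistic squashing of the scores, and a per‑subject AUC averaged across folds. Helper names and the simulated data are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def auc(y, s):
    # Rank-based AUC; assumes both classes are present in the fold.
    pos, neg = s[y == 1], s[y == 0]
    return (pos[:, None] > neg[None, :]).mean()

rng = np.random.default_rng(3)
n_sub, n_trials, n_vox = 5, 40, 6
subjects = np.repeat(np.arange(n_sub), n_trials)
X = rng.normal(size=(n_sub * n_trials, n_vox))      # trial-wise betas
w = rng.normal(size=n_vox)                          # true decoding weights
y = (rng.uniform(size=len(X)) < sigmoid(X @ w)).astype(float)

aucs = []
for s in np.unique(subjects):
    tr, te = subjects != s, subjects == s
    # LPM on the training subjects: OLS on the 0/1 outcome.
    beta, *_ = np.linalg.lstsq(X[tr], y[tr], rcond=None)
    p = sigmoid(X[te] @ beta)     # optional logistic squashing of LPM scores
    aucs.append(auc(y[te], p))
mean_auc = float(np.mean(aucs))   # final performance estimate
```

Because AUC is rank‑based, the sigmoid step changes the scores but not the reported performance; it only matters if calibrated probabilities are needed.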
8. Common Pitfalls & Remedies
| Issue | Symptom | Fix |
|---|---|---|
| Multicollinearity | Extreme β magnitudes, unstable predictions | Reduce dimensionality, use ridge/LASSO, or LSS |
| Over‑fitting | High training accuracy, low test AUC | Cross‑validate, regularize, limit voxel count |
| Class imbalance | Accuracy ≈ majority class | Use AUC, re‑sample, or class‑weighting |
| Run‑level bias | Systematic shifts across runs | Demean X per run; keep Y raw for binary outcome |
| Non‑independent observations | Inflated test statistics | Use permutation tests or mixed‑effects models |
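The ridge remedy for multicollinearity can be shown in closed form (numpy‑only sketch; `ridge` is an illustrative helper, not a library function):

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge: (X'X + lam*I)^(-1) X'y shrinks betas toward zero."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(4)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(0, 0.01, n)       # nearly collinear "voxel"
X = np.column_stack([x1, x2])
y = x1 + rng.normal(0, 0.1, n)

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # unstable under collinearity
beta_ridge = ridge(X, y, lam=1.0)                  # shrunken, stable
```

The ridge solution always has a smaller norm than the OLS solution, and with near‑duplicate predictors it spreads the shared signal across them instead of letting the coefficients blow up in opposite directions. LASSO behaves similarly but drives some coefficients exactly to zero.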
9. Take‑away Messages
- Decoding is regression: treat binary outcomes as probabilities.
- AUC beats accuracy when classes are imbalanced.
- Cross‑validation matters: choose a scheme that reflects the generalization you care about.
- Data formatting is critical: long format simplifies modeling and validation.
- Regularization is essential when using many voxels to avoid multicollinearity and over‑fitting.
“The only way to know if a pattern is real is to see it in other data sets.” – Emphasis on out‑of‑sample validation.