# ML Processing Summary (2026-01-03)

## Goal

Build a fast, day-to-day classifier to predict hereditary vs inflammatory demyelinating neuropathy, without using center as a model feature, and with clear leakage checks.

## Inputs and labels

- Input dataset: `output/final_dataset/final_dataset_labeled.jsonl`.
- Records in file: 408; used for ML: 408 (0 excluded).
- Label policy:
  - hereditary = neuropathy_type_encoded 0..3
  - inflammatory/paraneoplastic = 4..6

## Feature policy (performance + transparency)

Center is never used as a feature. Age and sensory absence are kept with explicit caveats.

Final feature set (11):

- age_at_exam_years
- sex_encoded
- sensory_responses_absent_encoded
- median_distal_latency_ms (any-side)
- median_ncv_mean_ms (any-side)
- median_cmap_amplitude_mean_mv (any-side)
- ulnar_distal_latency_ms (any-side)
- ulnar_ncv_mean_ms (any-side)
- ulnar_cmap_amplitude_mean_mv (any-side)
- median_ulnar_cmap_amplitude_ratio (derived)
- median_ulnar_ncv_diff_gt_10 (derived)

Any-side means: use left if present, else right.

## Pre-processing and robustness steps

- Winsorization (1-99 percentile) to reduce outliers.
- Median imputation for missing values.
- Standardization only for linear/SVM models.
- Center+label sample weights for training (center not used as a feature).

## Leakage and bias checks (summary)

- Permutation (labels shuffled) balanced accuracy ~0.52-0.55: no leakage signal.
- Center-only baseline balanced accuracy ~0.778: strong center confound.
- Group-CV by center (leave-one-center-out) balanced accuracy ~0.59-0.70, AUROC ~0.73-0.84: generalization to unseen centers remains limited.

## Chosen option for daily use (best overall performance)

Model: ExtraTrees (500 trees) with center-balanced sample weights.

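
The summary does not spell out how the center-balanced (center+label) sample weights are computed. A minimal sketch of one common scheme follows, assuming that every non-empty (center, label) cell should carry the same total weight and that weights are rescaled to average 1.0. The function name `center_label_weights` and the toy inputs are hypothetical and are not taken from `analysis/train_calculator_model.py`.

```python
from collections import Counter


def center_label_weights(centers, labels):
    """Return one weight per record, inverse to its (center, label) cell size.

    Assumption (not confirmed by the summary): each non-empty
    (center, label) cell gets equal total weight, and the weights
    are scaled so that they sum to the number of records.
    """
    cells = list(zip(centers, labels))
    counts = Counter(cells)      # records per (center, label) cell
    n_records = len(cells)
    n_cells = len(counts)
    return [n_records / (n_cells * counts[cell]) for cell in cells]


# Toy example with made-up centers: cell (A, 0) has two records,
# the other two cells have one record each.
weights = center_label_weights(["A", "A", "A", "B"], [0, 0, 1, 1])
print([round(w, 3) for w in weights])  # [0.667, 0.667, 1.333, 1.333]
```

Weights of this kind can be passed through the `sample_weight` argument of scikit-learn's `ExtraTreesClassifier.fit`, which keeps center out of the feature matrix while still limiting the influence of large centers.
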
- Stratified 5-fold CV (in-distribution):
  - accuracy 0.816
  - balanced accuracy 0.812
  - AUROC 0.890
  - average precision 0.846
  - confusion matrix: TN 193, FP 34, FN 41, TP 140

Important limitation:

- For a new center not represented in training, expected performance is lower.
- Age is informative but shows class/center imbalance; monitor missingness and center effects.

## Final model training + congress figures (2026-01-03)

Final model training and figure generation were executed for congress outputs:

- Output directory: `output/final_model_2026-01-03/`
- Model file: `output/final_model_2026-01-03/extra_trees_model.joblib`
- Leave-one-center-out results (balanced accuracy / AUROC):
  - HCFMRP_USP: 0.661 / 0.766
  - UFU: 0.696 / 0.839
  - USP_SP: 0.589 / 0.732
  - legacy_hcrp: 0.688 / 0.749
- Humanitas_Milano excluded from center CV because only one class is present.

Figures saved for the congress:

- `knowledge_base/presentations/2026_neuropathy_class_distribution.png`
- `knowledge_base/presentations/2026_neuropathy_roc_curve.png`
- `knowledge_base/presentations/2026_neuropathy_precision_recall.png`
- `knowledge_base/presentations/2026_neuropathy_confusion_matrix.png`
- `knowledge_base/presentations/2026_neuropathy_feature_importance.png`
- `knowledge_base/presentations/2026_neuropathy_groupcv_by_center.png`
- `knowledge_base/presentations/2026_neuropathy_ml_summary.pdf`

## Artifacts (full reproducibility)

- ML run: `output/ml_benchmark_2026-01-03_age_sensory_with_derived/`
  - `ml_dataset_hereditary_vs_inflammatory.csv`
  - `ml_run_manifest.json`
  - `model_metrics_summary.csv`
- Leakage audit: `output/ml_leakage_audit_2026-01-03_age_sensory_with_derived/`
  - `leakage_audit_manifest.json`
  - `group_cv_by_center.json`
  - `single_feature_auc.csv`
- Age confound checks: `output/age_checks_2026-01-03/`
- Final model: `output/final_model_2026-01-03/`
- Model training script: `analysis/train_calculator_model.py`
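
As a consistency check, the accuracy and balanced accuracy reported for the stratified 5-fold CV can be recomputed from its confusion matrix (TN 193, FP 34, FN 41, TP 140). The sketch below uses only the standard definitions. The helper name `binary_metrics` is hypothetical, and the sketch makes no assumption about which class is coded as positive.

```python
def binary_metrics(tn, fp, fn, tp):
    """Derive summary metrics from the four confusion-matrix counts."""
    total = tn + fp + fn + tp
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    return {
        "n": total,
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Balanced accuracy is the mean of the two per-class recalls.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


m = binary_metrics(tn=193, fp=34, fn=41, tp=140)
print(m["n"])                            # 408
print(f"{m['accuracy']:.3f}")            # 0.816
print(f"{m['sensitivity']:.3f}")         # 0.773
print(f"{m['specificity']:.3f}")         # 0.850
print(f"{m['balanced_accuracy']:.3f}")   # 0.812
```

The counts sum to the 408 records used for ML, and the recomputed accuracy (0.816) and balanced accuracy (0.812) match the reported values.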