Bidirectional Cross-Scale Attention Fusion
Lobe Ranger (MOPFN) integrates global tissue architecture (20x) and detailed cell cytology (40x) to mimic a pathologist's workflow of zooming in and out under the microscope.
Shared ViT-B Backbone
Both magnifications pass through a single shared Vision Transformer (ViT-B/16) backbone. Its weights are frozen, so training only updates the fusion module and the task heads.
Multi-Scale Image Pairing
Pairs 20x macro-structural patches with 40x micro-cytological patches from the same patient, capturing multi-resolution spatial features without heavy downsampling.
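
As a rough illustration of patient-wise pairing, the sketch below groups patches by patient and magnification and then pairs them one-to-one. The `(patient_id, magnification, path)` tuple layout and the `pair_patches` name are assumptions made for this example, not the repository's actual data format. The round-robin pairing is one possible way to keep the pair count linear in the number of patches instead of forming every 20x/40x combination.

```python
from collections import defaultdict
from itertools import cycle, islice


def pair_patches(patches):
    """Pair 20x and 40x patches that belong to the same patient.

    `patches` is an iterable of (patient_id, magnification, path) tuples,
    where magnification is "20x" or "40x" (hypothetical layout).
    Pairing is round-robin rather than a Cartesian product, so a patient
    with m and n patches yields max(m, n) pairs, not m * n.
    """
    by_patient = defaultdict(lambda: {"20x": [], "40x": []})
    for patient_id, magnification, path in patches:
        by_patient[patient_id][magnification].append(path)

    pairs = []
    for patient_id in sorted(by_patient):
        low = sorted(by_patient[patient_id]["20x"])
        high = sorted(by_patient[patient_id]["40x"])
        if not low or not high:
            continue  # a patient needs both scales to form a pair
        n_pairs = max(len(low), len(high))
        # Cycle the shorter list so every patch appears in at least one pair.
        for lo_path, hi_path in islice(zip(cycle(low), cycle(high)), n_pairs):
            pairs.append((patient_id, lo_path, hi_path))
    return pairs
```

Because pairs never cross patient boundaries, a later patient-level train/test split cannot place two halves of one pair on opposite sides.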
Bidirectional Attention
Enables information exchange between scales: the 20x branch queries the 40x branch for fine cytologic details, and the 40x branch queries 20x for wider contextual layout.
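
The exchange between scales can be sketched with plain scaled dot-product attention. This is a deliberately minimal, dependency-free illustration: it uses a single head and no learned query/key/value projections, and the function names are hypothetical. A real model would typically use a tensor library with multi-head attention and learned projections.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, context):
    """Each query token reads a weighted average of the context tokens."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(context[0])
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in context]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, context)) for d in range(dim)]
        )
    return attended


def bidirectional_fusion(tokens_20x, tokens_40x):
    """Run cross-attention in both directions with a residual connection."""
    # 20x tokens query the 40x tokens for fine cytologic detail.
    detail = cross_attend(tokens_20x, tokens_40x)
    # 40x tokens query the 20x tokens for wider architectural context.
    layout = cross_attend(tokens_40x, tokens_20x)
    # The residual add keeps each branch's own features alongside what it read.
    fused_20x = [[a + b for a, b in zip(t, d)] for t, d in zip(tokens_20x, detail)]
    fused_40x = [[a + b for a, b in zip(t, c)] for t, c in zip(tokens_40x, layout)]
    return fused_20x, fused_40x
```

Each branch keeps its own token count after fusion, so the two scales do not need the same number of tokens.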
Hierarchical Ordinal Heads
Multi-task heads predict binary malignancy, subtype (masked out for normal tissue), and ordinal differentiation grade using rank-consistent ordinal regression (CORAL).
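
To make the ordinal and masked heads more concrete, here is a small dependency-free sketch. In CORAL-style ordinal regression, a grade among K ordered classes is encoded as K-1 binary "grade > k" targets, and the head produces one shared score plus K-1 threshold biases. The three-grade setting in the usage note, the function names, and the `masked_mean` helper are assumptions for illustration, not the repository's code.

```python
import math


def coral_targets(grade, num_grades):
    """Encode a grade in [0, num_grades) as K-1 binary 'grade > k' targets."""
    return [1 if grade > k else 0 for k in range(num_grades - 1)]


def coral_logits(score, biases):
    """One shared scalar score shifted by K-1 threshold biases."""
    return [score + b for b in biases]


def coral_predict(logits):
    """Predicted grade = number of thresholds passed with probability > 0.5."""
    return sum(1 for z in logits if z > 0)  # sigmoid(z) > 0.5 iff z > 0


def coral_loss(logits, targets):
    """Sum of binary cross-entropies over the K-1 threshold tasks."""
    loss = 0.0
    for z, t in zip(logits, targets):
        # Stable BCE-with-logits: max(z, 0) - z * t + log(1 + exp(-|z|))
        loss += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return loss


def masked_mean(losses, mask):
    """Average only the losses whose mask entry is truthy (e.g. malignant)."""
    kept = [loss for loss, keep in zip(losses, mask) if keep]
    return sum(kept) / len(kept) if kept else 0.0
```

For example, with three grades `coral_targets(1, 3)` gives `[1, 0]`, and a subtype loss averaged with `masked_mean` ignores normal samples, which have no subtype label.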
Explainable AI (XAI) & Interpretability
The repository includes both saliency maps and token-level cross-scale attention inspection so qualitative analysis can be regenerated alongside the predictive pipeline.
Quantitative Evaluation Results
These values come from the latest regenerated 5-epoch local run (`outputs/improved_train_metrics.json` and `outputs/improved_eval_metrics.json`), produced after the split construction and the pair explosion were corrected.
| Task | Metric | Value |
|---|---|---|
| Malignancy | Accuracy | 90.11% |
| Malignancy | F1 | 94.41% |
| Malignancy | ROC-AUC | 0.9956 |
| Subtype | Accuracy | 64.47% |
| Subtype | Macro F1 | 61.54% |
| Differentiation | Accuracy | 40.79% |
| Differentiation | MAE | 0.592 |
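
The two differentiation rows are easier to interpret together. The sketch below shows how accuracy and mean absolute error (MAE) could be computed over integer grade labels. The function names are illustrative, not the repository's evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance, in grade steps, between prediction and label."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

When every wrong prediction is off by exactly one grade, MAE equals 1 minus accuracy. The reported MAE of 0.592 is very close to 1 - 0.4079 = 0.5921, so if both figures were computed on the same samples, nearly all misgraded cases would be adjacent-grade errors rather than large jumps.
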
Caveat: this corrected split is much healthier than the earlier 142-vs-3 subtype imbalance, but it is still a 5-patient held-out set. Subtype now looks plausible, while ordinal performance remains unstable and should be cross-validated before publication.
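
One way to run the suggested cross-validation is to split by patient so that no patient contributes samples to both the training and the test side of a fold. The sketch below is a minimal, dependency-free version under that assumption. The `patient_folds` name is hypothetical, and the assignment does not stratify by label, which a small cohort might also need.

```python
def patient_folds(patient_ids, num_folds):
    """Yield (train_indices, test_indices), keeping each patient in one fold.

    `patient_ids[i]` is the patient that sample i came from.
    """
    unique = sorted(set(patient_ids))
    # Deterministic round-robin assignment of patients to folds.
    fold_of = {pid: i % num_folds for i, pid in enumerate(unique)}
    for fold in range(num_folds):
        test = [i for i, pid in enumerate(patient_ids) if fold_of[pid] == fold]
        train = [i for i, pid in enumerate(patient_ids) if fold_of[pid] != fold]
        yield train, test
```

Averaging each metric over the folds, and reporting its spread, would give a steadier estimate than a single 5-patient held-out set.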