Benchmark Results

Validated on held-out data

Every metric is computed on data the model never saw during training. We report both successes and limitations.

Uncertainty-Gated C-index: 0.74 (Green-tier external, N=229)
PDX Trajectory R²: 0.91 (Model vs. Biology)
Proliferation r: 0.96 (vs. MKI67)
Inference: <5ms per patient
Global C-index: 0.704
Physics Violations: 0.00%

STATIC VALIDATION (TCGA)

Survival prediction

Stratified C-index: 0.670

Within-indication ranking accuracy on held-out TCGA data. This shows that the model ranks patients within cancer types, not just between them.
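
A stratified C-index can be sketched as follows: comparable patient pairs are formed only within the same cancer type, then the concordant and comparable counts are pooled across types. This is a minimal illustration using Harrell's definition, not DNAI's evaluation code; the function names and toy values are hypothetical.

```python
from collections import defaultdict


def concordant_counts(times, events, risks):
    """Return (concordant, comparable) pair counts for Harrell's C-index.

    A pair is comparable when the patient with the shorter time had an
    observed event. Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant, comparable


def stratified_c_index(times, events, risks, strata):
    """Pool pair counts over pairs drawn from the same stratum only."""
    groups = defaultdict(list)
    for idx, label in enumerate(strata):
        groups[label].append(idx)
    total_conc, total_comp = 0.0, 0
    for idxs in groups.values():
        conc, comp = concordant_counts(
            [times[k] for k in idxs],
            [events[k] for k in idxs],
            [risks[k] for k in idxs],
        )
        total_conc += conc
        total_comp += comp
    return total_conc / total_comp if total_comp else float("nan")


# Toy data (hypothetical): type "A" has short survival, type "B" long survival.
times = [5, 10, 15, 50, 60, 70]
events = [1, 1, 0, 1, 1, 1]
risks = [0.9, 0.5, 0.1, 0.2, 0.3, 0.1]
strata = ["A", "A", "A", "B", "B", "B"]

conc, comp = concordant_counts(times, events, risks)
global_c = conc / comp
stratified_c = stratified_c_index(times, events, risks, strata)
print(f"global={global_c:.3f} stratified={stratified_c:.3f}")
```

In this toy data the global C-index (11/12) is higher than the stratified one (5/6) because every cross-type pair is easy to rank. That is the effect a stratified metric removes.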

Uncertainty-Gated C-index: 0.74

Measured on the Green-tier subset of the external CPTAC cohort (N=229), where ISS exceeds the threshold. The model is DRO-trained and scored only on this validated subset.
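
Uncertainty gating can be sketched as a filter applied before scoring: patients whose confidence score is at or below a cutoff are set aside, and the C-index is computed on the rest. The sketch assumes ISS is a per-patient score where higher means more reliable. The cutoff of 0.5, the function names, and the toy values are placeholders, since the actual threshold is not stated here.

```python
def c_index(times, events, risks):
    """Harrell's C-index; tied risk scores count as half-concordant."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def gated_c_index(times, events, risks, scores, threshold):
    """Score only patients whose confidence score exceeds the threshold.

    Returns the C-index on the retained subset and the subset size, so the
    coverage cost of gating is reported next to the metric.
    """
    keep = [k for k, s in enumerate(scores) if s > threshold]
    value = c_index(
        [times[k] for k in keep],
        [events[k] for k in keep],
        [risks[k] for k in keep],
    )
    return value, len(keep)


# Toy data (hypothetical): the one low-confidence patient is also mis-ranked.
times = [2, 4, 6, 8, 10, 12]
events = [1, 1, 1, 1, 1, 1]
risks = [0.9, 0.8, 0.3, 0.6, 0.5, 0.4]
scores = [0.9, 0.8, 0.2, 0.7, 0.9, 0.6]

all_patients = c_index(times, events, risks)
gated, n_kept = gated_c_index(times, events, risks, scores, threshold=0.5)
print(f"all={all_patients:.2f} gated={gated:.2f} kept={n_kept}/{len(times)}")
```

Reporting the retained count alongside the gated metric keeps the trade-off visible: the score improves only for the patients that pass the gate.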

33 cancer types

Trained and validated across all major TCGA cohorts for pan-cancer applicability.

Benchmark vs. Standard of Care

C-index comparison on held-out TCGA pan-cancer cohort

Random Chance: 0.50
Cox PH (SoC): 0.62
DNAI (Intent-to-Treat): 0.704
DNAI (Uncertainty-Gated): 0.74

On high-confidence predictions, the uncertainty-gated C-index of 0.74 is 0.12 above Cox Proportional Hazards (0.62), a relative improvement of about 19%. DNAI's epistemic uncertainty calibration identifies when predictions are reliable.

Representation quality

Statistical Orthogonality: R² < 0.001

Proliferation and context subspaces are statistically independent, enabling clean biological interpretation.

Biological validity: r = 0.95

Proliferation latent correlates strongly with MKI67 expression, validating biological meaning.

Reconstruction: r > 0.85

High-fidelity reconstruction across all input modalities.
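
The three checks in this section can each be read as a correlation between two vectors. The sketch below assumes Pearson's r and assumes each quantity is summarized as one value per sample; the variable names and toy values are hypothetical, and the source does not state how the metrics were actually computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy values (hypothetical), one entry per sample.
proliferation_latent = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mki67_expression = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]  # tracks the latent closely
context_latent = [1.0, -1.0, 0.0, 0.0, -1.0, 1.0]    # built to be uncorrelated

# Biological validity: the latent should track the marker gene.
validity_r = pearson_r(proliferation_latent, mki67_expression)

# Orthogonality: for two scalar variables, R² of a linear fit equals r squared.
orthogonality_r2 = pearson_r(proliferation_latent, context_latent) ** 2

print(f"validity r={validity_r:.3f} orthogonality R2={orthogonality_r2:.4f}")
```

A near-zero R² from a linear fit rules out linear dependence only. A nonlinear relationship between two subspaces can still produce R² close to zero, so a stronger independence claim needs an additional test.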

DYNAMIC VALIDATION (PDX)

Trajectory simulation

PDX Validation R²: 0.91

Model vs. Biology: Physics parameters learned from PDX (patient-derived xenograft) growth curves accurately predict real tumor dynamics.
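
A trajectory R² can be sketched as the coefficient of determination between measured and predicted tumor volumes at matching time points. The source does not say which R² definition was used, so this form is an assumption, and the volumes below are made-up toy values.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy growth curve (hypothetical volumes in mm^3 at five measurement days).
observed_volume = [100.0, 150.0, 220.0, 310.0, 400.0]
predicted_volume = [110.0, 140.0, 230.0, 300.0, 410.0]

trajectory_r2 = r_squared(observed_volume, predicted_volume)
print(f"trajectory R2 = {trajectory_r2:.4f}")
```

Under this definition a model that always predicts the mean observed volume scores 0, and a model worse than that scores below 0.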

Emulator Fidelity: 0.997

Math vs. Math: The learned trajectory emulator matches the numerical ODE solver, enabling <5ms inference.
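
A "Math vs. Math" check can be sketched by running a numerical solver and a fast surrogate on the same parameters and scoring their agreement. Everything specific here is an assumption: the growth law (logistic), the RK4 integrator, and the parameter values are illustrative, and the closed-form solution stands in for a learned emulator, which this sketch does not train.

```python
import math


def logistic_rhs(v, rate, capacity):
    """Logistic growth: dV/dt = rate * V * (1 - V / capacity)."""
    return rate * v * (1.0 - v / capacity)


def rk4_trajectory(v0, rate, capacity, t_end, steps):
    """Reference trajectory from a fixed-step fourth-order Runge-Kutta solver."""
    dt = t_end / steps
    v = v0
    out = [v]
    for _ in range(steps):
        k1 = logistic_rhs(v, rate, capacity)
        k2 = logistic_rhs(v + 0.5 * dt * k1, rate, capacity)
        k3 = logistic_rhs(v + 0.5 * dt * k2, rate, capacity)
        k4 = logistic_rhs(v + dt * k3, rate, capacity)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append(v)
    return out


def surrogate_trajectory(v0, rate, capacity, t_end, steps):
    """Stand-in for an emulator: closed-form logistic solution on the same grid."""
    dt = t_end / steps
    a = capacity / v0 - 1.0
    return [capacity / (1.0 + a * math.exp(-rate * i * dt)) for i in range(steps + 1)]


def r_squared(reference, approx):
    """Coefficient of determination of approx against the reference."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((a - b) ** 2 for a, b in zip(reference, approx))
    ss_tot = sum((a - mean_ref) ** 2 for a in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical parameters: initial volume, growth rate, carrying capacity.
params = dict(v0=50.0, rate=0.1, capacity=1000.0, t_end=60.0, steps=600)
reference = rk4_trajectory(**params)
fast = surrogate_trajectory(**params)

fidelity = r_squared(reference, fast)
max_gap = max(abs(a - b) for a, b in zip(reference, fast))
print(f"fidelity R2={fidelity:.6f} max abs gap={max_gap:.2e}")
```

Because both sides are computed from the same equations, this check measures approximation error only. It says nothing about whether the equations describe real tumors, which is what the PDX comparison addresses.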

Speed: <5ms per trajectory

400-1000x faster than numerical solver, enabling real-time treatment optimization.

Note: We do not validate trajectories on TCGA. TCGA is snapshot data, so it has no time series to compare a predicted trajectory against. PDX data provides true longitudinal measurements.

Known limitations

Training data scope

Trained primarily on TCGA (9,393 patients, 33 cancer types). DRO training improves cross-site generalization, but performance on rare cancers or non-standard sample preparation may vary.

Research use only

Not approved for clinical decision-making. Intended for research and pilot deployments.

External validation: CPTAC C-index 0.718

Validated on 1,031 patients across 10 independent CPTAC cohorts (never seen during training). DRO-trained model achieves pooled C-index 0.718 [0.684, 0.750], with 7/9 cohorts above random. Additional external datasets: CGGA (970 glioma), SCAN-B (3,069 breast).

See it in action

Schedule a demo to explore DNAI with your own data.

Request Demo