Open Science

Our Research

Peer-reviewed publications, patent portfolio, and technical documentation underlying the DNAI physics-constrained cancer digital twin platform.

14 Publications | 15 Patent Applications | 3 Technical Reports | 248,000+ Patients Across 9 External Cohorts
Preprints & Publications

Research Publications

Original research from the DNAI project. All preprints are freely available for download. Journal submissions are in progress.

Domain Generalization | Preprint | Patent 6: Distributionally Robust Training

Domain Shift is a Feature, Not a Bug: Distributionally Robust Optimization Outperforms Harmonization in Clinical Foundation Models

Feb 2026

We challenge the harmonization paradigm, demonstrating that site-specific variance encodes critical prognostic information. Group DRO with Pooled Cox achieves C=0.718 on CPTAC, outperforming all baselines including ComBat and CORAL.

CPTAC external C-index: 0.718 | Green-tier C-index: 0.744 (22.2% coverage) | 8 external cohorts, 236,000+ patients
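As a rough illustration of the Group DRO idea named in this abstract, the sketch below shows the generic exponentiated-gradient update of per-group weights, where each group stands for one site or cohort. It is not the DNAI implementation: the function names, the step size `eta`, and the per-site loss values are all hypothetical, and a real training loop would recompute the losses after every model update.

```python
import math

def group_dro_step(weights, group_losses, eta=0.5):
    """One exponentiated-gradient update of the group weights.

    Groups with a higher loss are up-weighted, so the next model update
    concentrates on the worst-performing site.
    """
    scaled = [w * math.exp(eta * loss) for w, loss in zip(weights, group_losses)]
    total = sum(scaled)
    return [s / total for s in scaled]

def robust_loss(weights, group_losses):
    """Weighted objective the model would minimise at this step."""
    return sum(w * loss for w, loss in zip(weights, group_losses))

# Hypothetical per-site losses (for example, a Cox loss computed per cohort).
# They are held fixed here only to show how the weights move.
site_losses = [0.60, 0.95, 0.70]
weights = [1 / 3] * 3
for _ in range(10):
    weights = group_dro_step(weights, site_losses)

print([round(w, 3) for w in weights])
print(round(robust_loss(weights, site_losses), 3))
```

The weights stay on the probability simplex and drift toward the site with the largest loss, which is why the robust objective ends up above the plain average of the site losses.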
Treatment Optimization | Preprint | Patent 7: Risk-Averse Stochastic Optimization

The Plural Twin: Quantifying Treatment Policy Stability via Set-Valued Cancer Digital Twins

Feb 2026

We introduce Plural Twins, a set-valued framework where each patient is represented as a distribution of outcomes. 82.9% of patients show policy instability; for 1 in 6, the optimal treatment depends on the algorithm's risk tolerance.

1,000 MC dropout realizations per patient | CVaR-optimal concordance: 914d vs 763d (p=3.5e-10) | 16.7% CVaR vs mean discordance
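To make the "CVaR vs mean discordance" concrete, here is a minimal sketch of how a mean-optimal and a CVaR-optimal treatment choice can disagree for a single patient. The survival samples, the tail fraction `alpha`, and all function names are assumptions for illustration only; in the paper's setting the samples would come from the per-patient MC dropout realizations.

```python
def mean(values):
    return sum(values) / len(values)

def cvar_lower(values, alpha=0.2):
    """Mean of the worst alpha-fraction of outcomes (shortest survival times)."""
    ordered = sorted(values)
    k = max(1, int(round(alpha * len(ordered))))
    return mean(ordered[:k])

def pick(treatments, score):
    """Return the treatment whose sampled outcomes maximise the given score."""
    return max(treatments, key=lambda name: score(treatments[name]))

# Hypothetical survival draws (days) for one patient under two treatments.
samples = {
    "A": [100, 200, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600],
    "B": [700, 750, 800, 820, 850, 870, 900, 920, 950, 1000],
}

mean_choice = pick(samples, mean)        # favours the higher average
cvar_choice = pick(samples, cvar_lower)  # favours the better worst case
discordant = mean_choice != cvar_choice
print(mean_choice, cvar_choice, discordant)
```

Treatment A has the higher average but a much worse lower tail, so a risk-neutral rule and a risk-averse rule select different treatments for this toy patient.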
Transfer Learning | Preprint | Patent 1: Sim-to-Real Transfer | Patent 2: Collapse Prevention

The Sim-to-Real Scaling Paradox: Biological Heterogeneity Reverses Transfer Learning Gains in Cancer Digital Twins

Feb 2026

We report a counterintuitive data scaling paradox: expanding PDX training data from 128 to 573 samples degrades clinical prediction. Multi-cancer alignment erases biology-preserving variance through shortcut domain adaptation.

128-PDX outperforms 573-PDX (Strat C: 0.654 vs 0.640) | 573-PDX domain accuracy: 0.807 (fails alignment) | 5 joint training variants evaluated

Uncertainty Quantification | Preprint | Patent 8: Runtime Gating & Solver Interlock

Runtime Reliability Labeling for Safe Deployment of Oncology AI Under Distribution Shift

Feb 2026

A unified framework for certifying when a clinical AI prediction is reliable enough for decision-making. Per-patient transportability certificates, structured abstention, and evidence-completion recommendations.

20.1% clinical-grade coverage (C>0.6) | 3-tier triage: STABLE / CONTESTED / CHAOTIC | ICI calibration: 0.0094 (isotonic)

Clinical Decision Support | Preprint | Patent 7: Risk-Averse Stochastic Optimization

The Glass Cannon Phenotype: High Predicted Benefit but Low Robustness to Biological and Dosage Perturbations

Feb 2026

We identify a clinically actionable phenotype defined by the intersection of high predicted treatment benefit and low robustness to perturbation. These patients (7.0%) show a median OS of 478 days with a 71.1% event rate, the worst outcomes despite high expected benefit.

4 quadrants: Solid / Glass Cannon / Stable / Fragile | Fragility AUC for early failure: 0.788 | 30.9% of treatment plans change under fragility adjustment

Causal Inference | Preprint

Practical Identifiability Failure in Physics-Constrained Cancer Models: The Therapeutic Controllability Index

Feb 2026

We report a fundamental identifiability failure: drug sensitivity (beta) occupies 0.14% of its allowed range under survival-only supervision. Rather than treating this as a defect, we introduce the Therapeutic Controllability Index to quantify treatment authority per patient.

Beta/rho sensitivity ratio: 0.0017 | TCI C-index for survival: 0.735 | CONTROLLABLE vs DETERMINED: 1153d vs 489d (p=4.3e-55)

Clonal Evolution | Preprint

Clonal Architecture-Aware Digital Twins: Bridging VAF-Based Deconvolution with Physics-Constrained Tumor Simulation

Mar 2026

A complete pipeline bridging VAF-based clonal deconvolution with physics-constrained tumor simulation. Introduces the Resistance Sentinel strategy that preserves clinically critical minor subclones in fixed-compartment ODE models, and knowledge-grounded per-clone drug sensitivity from OncoKB/CIViC replacing under-identified learned parameters.

72/72 tests, 6-expert validated | Resistance Sentinel: auto-promotes minor resistant clones | Knowledge-grounded drug sensitivity (30+ associations)

Methods / Architecture | Preprint

Latent-Space Clonal Decomposition: Attention-Weighted Variational Inference for Per-Subpopulation Biological State Estimation

Mar 2026

A novel architecture extending variational autoencoders to jointly decompose bulk tumor omics into per-subpopulation biological states in a structured latent space. Unlike mutation-only deconvolution, LSCD infers per-clone pathway activities, growth rates, and epigenetic states, which resolves the under-identification of drug sensitivity parameters.

Deconvolution in biological function space (novel) | Clone count as model inference, not hyperparameter | Resolves beta under-identification (beta/rho=0.0017)

Validation Protocol | Preprint

Resistance Forecasting via Clonal Digital Twins: A Retrospective Validation Protocol

Mar 2026

A pre-registered validation protocol for testing clone-aware digital twin predictions against longitudinal patient data. Compares predicted resistance emergence timing and dominant resistant clone identity against observed outcomes in patients with serial biopsies from GLASS, TRACERx, and GENIE BPC cohorts.

Primary: dominant clone concordance at progression | Cohorts: GLASS (222), TRACERx, GENIE BPC | Protocol paper: pre-registration design

Methods / Architecture | Preprint

Complementary Foundation Model Distillation for Multi-Omics Cancer Digital Twins

Mar 2026

We distill domain-specific foundation models (Geneformer, ESM-2) into the structured latent space of a multi-omics VAE, achieving R^2=0.774 on full latent reconstruction. The complementary distillation strategy preserves biological interpretability while incorporating protein structure and gene regulatory knowledge.

Geneformer to z_pathway: R^2=0.731 | Combined distillation: R^2=0.774 | Preserves structured latent interpretability

Drug Combinations | Preprint | Patent 13: Drug Combination Efficacy Prediction

Zero-Shot Drug Combination Discovery via Orthogonal Pathway Sensitivity Profiles

Mar 2026

We predict drug combination efficacy from monotherapy data alone using pathway sensitivity orthogonality. Validated on 1,209 GDSC drug pairs, the method achieves Spearman rho=0.800 between predicted and actual combination patterns, enabling zero-shot discovery of synergistic combinations without combination screens.

Predicted vs actual combination rho=0.800 (p<0.0001) | 1,209 drug pairs validated from GDSC | Zero-shot: no combination training data required
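One way to picture "pathway sensitivity orthogonality" is to rank drug pairs by how little their monotherapy profiles overlap. The sketch below uses one minus the absolute cosine similarity as a stand-in score. This scoring function is an assumption, not the paper's method (the related patent title refers to a symmetric bilinear interaction), and the drug names and profile values are invented for illustration.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two pathway sensitivity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def orthogonality_score(u, v):
    """Near 1.0 for non-overlapping profiles, near 0.0 for parallel ones."""
    return 1.0 - abs(cosine(u, v))

# Hypothetical monotherapy-derived sensitivities over three pathways.
profiles = {
    "drug_x": [0.9, 0.1, 0.0],
    "drug_y": [0.8, 0.2, 0.0],
    "drug_z": [0.0, 0.1, 0.9],
}

# Rank every unordered pair from most to least orthogonal.
ranked = sorted(
    ((pair, orthogonality_score(profiles[pair[0]], profiles[pair[1]]))
     for pair in combinations(sorted(profiles), 2)),
    key=lambda item: item[1],
    reverse=True,
)
for pair, score in ranked:
    print(pair, round(score, 3))
```

No combination measurements enter the score, which is the sense in which such a ranking is zero-shot. Whether a high score corresponds to real synergy is exactly what a validation against measured drug pairs has to establish.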
Methods / Benchmarking | Preprint

Identity Memorization in Drug Synergy Prediction: A Cautionary Analysis of Cell-Line Overlap

Mar 2026

We demonstrate that standard drug synergy benchmarks suffer from identity memorization: models learn cell-line fingerprints rather than genuine drug interaction biology. Cell-line-holdout evaluation reveals dramatic performance drops compared to random splits, calling for stricter evaluation protocols in combination prediction.

Random-split vs cell-holdout performance gap exposed | Cell-line identity leakage quantified across benchmarks | Strict evaluation protocol proposed
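The difference between a random split and a cell-line-holdout split can be sketched in a few lines. The record layout, field names, and toy benchmark below are hypothetical. The point is only that a holdout split assigns whole cell lines to one side, so no cell-line identity seen in training can reappear at test time.

```python
import random

def random_split(records, test_frac=0.25, seed=0):
    """Shuffle individual records; the same cell line can land on both sides."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

def cell_line_holdout_split(records, test_frac=0.25, seed=0):
    """Hold out whole cell lines so train and test never share one."""
    rng = random.Random(seed)
    lines = sorted({r["cell_line"] for r in records})
    rng.shuffle(lines)
    n_test = max(1, int(len(lines) * test_frac))
    held_out = set(lines[:n_test])
    train = [r for r in records if r["cell_line"] not in held_out]
    test = [r for r in records if r["cell_line"] in held_out]
    return train, test

def shared_cell_lines(train, test):
    """Cell lines present on both sides of a split (the leakage channel)."""
    return {r["cell_line"] for r in train} & {r["cell_line"] for r in test}

# Hypothetical toy benchmark: 4 cell lines x 5 drug pairs.
records = [
    {"cell_line": f"CL{i}", "pair": f"D{j}-D{j + 1}", "synergy": 0.0}
    for i in range(4)
    for j in range(5)
]

holdout_train, holdout_test = cell_line_holdout_split(records)
print(shared_cell_lines(holdout_train, holdout_test))  # empty set by construction
```

Running `shared_cell_lines` on the output of `random_split` is a quick check for how much identity overlap a benchmark split carries.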
Identifiability / Methods | Preprint | Patent 15: Cross-Domain Parameter Anchoring | Patent 16: Runtime Execution Controller

Treatment-Gated Gradient Isolation Reduces Prognostic Leakage in Neural ODE Survival Modeling

Apr 2026

We demonstrate that mechanistic Neural ODEs trained on observational survival data suffer from a fundamental identifiability failure: drug response parameters are either excluded from the computation graph or hijacked as prognostic proxies. Our V4.1 architecture enforces untreated gradient isolation via treatment-gated graph surgery, hard biological parameterization, and two-stage PDX system identification, achieving causally identified drug sensitivity (Spearman rho=0.416 vs RECIST, p<0.0001) without sacrificing survival prediction (C-index 0.742).

Jacobian = 0.0 exactly for untreated patients (structural guarantee) | PDX-validated drug sensitivity: rho=0.416 vs RECIST response | C-index 0.742 maintained with causally identified parameters
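The structural claim that the drug-sensitivity Jacobian is exactly zero for untreated patients can be illustrated with a toy model in which the drug term is multiplied by a treatment gate. Everything below is an assumption made for illustration: the logistic growth law, the reading of rho as a growth rate, the Euler integrator, and the parameter values. It is not the V4.1 architecture, and it uses a finite difference in place of automatic differentiation.

```python
def simulate(rho, beta, treated, n0=1.0, capacity=10.0, dt=0.1, steps=50):
    """Euler-integrate a toy logistic tumour model with a gated kill term."""
    gate = 1.0 if treated else 0.0
    n = n0
    for _ in range(steps):
        growth = rho * n * (1.0 - n / capacity)
        kill = gate * beta * n  # identically zero when the patient is untreated
        n = n + dt * (growth - kill)
    return n

def d_output_d_beta(rho, beta, treated, eps=1e-3):
    """Central finite difference of the final tumour burden with respect to beta."""
    hi = simulate(rho, beta + eps, treated)
    lo = simulate(rho, beta - eps, treated)
    return (hi - lo) / (2 * eps)

untreated_sensitivity = d_output_d_beta(rho=0.3, beta=0.2, treated=False)
treated_sensitivity = d_output_d_beta(rho=0.3, beta=0.2, treated=True)
print(untreated_sensitivity, treated_sensitivity)
```

Because the gate removes beta from the untreated trajectory entirely, the two perturbed simulations are bit-for-bit identical and the sensitivity is exactly 0.0 rather than merely small. For treated patients the sensitivity is negative, since a larger beta shrinks the final burden.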
Domain Generalization | Preprint | Patent 6: Distributionally Robust Training

Data Diversity Outperforms Feature Injection for Cross-Institutional Cancer Survival Prediction

Apr 2026

We demonstrate that the within-cancer survival prediction ceiling (C-index 0.57) is caused by institutional domain shift, not feature poverty. Discrete-time hazard modeling (DeepHit) with multi-institutional Group DRO across 6 environments (17,526 patients) improved the held-out CGGA within-cancer C-index from 0.548 to 0.630. Foundation Model injection (-0.043) and proteomics LUPI (-0.041) both degraded external generalization. Adding diverse training cohorts (+0.045) was 10× more effective than adding features.

CGGA held-out within-cancer C: 0.548 → 0.630 (+0.082) | Feature injection degrades: FM -0.043, Proteomics -0.041 | 17,526 patients, 6 institutions, OpenPedCan val C=0.727

Intellectual Property

Patent Portfolio

15 U.S. Provisional Applications filed, covering physics-constrained simulation, domain adaptation, uncertainty quantification, runtime safety, risk-averse optimization, knowledge-grounded annotation, and drug combination prediction.

Filed

#1: Sim-to-Real Transfer

Physics-Constrained Sim-to-Real Transfer Learning

Jan 25, 2026

Filed

#2: Collapse Prevention

Preventing Metabolic Scaling-Induced Collapse

Feb 2, 2026

Filed

#3: PESD & Missing Modalities

Uncertainty-Calibrated Missing Modality Imputation

Feb 2, 2026

Filed

#4: Pathway-Aware GRL

Ontology-Guided Autogradient Modulation

Feb 23, 2026

Filed

#5: ExecutionToken & Solver Interlock

Adjoint Sensitivity & Physics-Constrained Gradient Topologies

Feb 23, 2026

Filed

#6: Group DRO for Digital Twins

Distributionally Robust Training (DRO)

Feb 23, 2026

Filed

#7: Risk-Averse Stochastic Optimization

Stabilized Stochastic Inference and Risk-Averse Optimization in Physics-Constrained Oncology Digital Twins

Feb 26, 2026

Filed

#8: Runtime Gating & Solver Interlock

Cryptographically Enforced Runtime Resource Gating in Differential Equation Solvers via Memory Allocation Interlock

Feb 26, 2026

Filed

#9: Fail-Closed Integrity Gating

System and Method for Stabilized Stochastic Inference and Fail-Closed Integrity Gating in Neural Networks

Apr 4, 2026

Filed

#10: Knowledge-Grounded Parameter Annotation

Systems and Methods for Knowledge-Grounded Parameter Annotation and Execution Control of Computational Dynamics Simulators

Apr 4, 2026

Filed

#11: Safety Assurance & Execution Control

System and Method for Cryptographically Enforced Execution Mode and Automated Safety Assurance of Calibrated Machine Learning Pipelines

Apr 4, 2026

Filed

#12: Resistance Sentinel Architecture

System and Method for GPU Execution Control via Privilege-Separated Resistance Sentinel Architecture in Fixed-Compartment Tumor History Modeling

Apr 4, 2026

Filed

#13: Drug Combination Efficacy Prediction

System and Method for Predicting Drug Combination Efficacy Using Symmetric Bilinear Interaction of Monotherapy-Derived Pathway Sensitivity Profiles

Apr 4, 2026

Filed

#15: Cross-Domain Parameter Anchoring

System and Method for Cross-Domain Interventional Parameter Anchoring for Mechanistic Digital Twin Calibration

Apr 12, 2026

Filed

#16: Runtime Execution Controller

Runtime Execution Controller for Differentiable Simulators with Autodiff-Verified Parameter Exclusion and Fail-Closed Enforcement

Apr 12, 2026

Validated Performance

Key metrics across internal and external cohorts

0.704 | Path A C-index | Internal (TCGA)
0.718 | DRO External | CPTAC (1,031)
0.744 | Green Tier | 22.2% coverage
0.729 | CGGA External | Glioma (485 held-out)
0.009 | Calibration ICI | Isotonic
0.735 | TCI C-index | Controllability
+151d | CVaR Survival | Concordant vs not
333 | Patent Claims | 15 applications

Research Use Only

All publications and methods described here are for research purposes only and have not been cleared or approved by any regulatory authority for clinical use. The DNAI platform is not a medical device. Patent applications are U.S. Provisional Applications; no patents have been granted.

Interested in collaborating?

We welcome academic collaborations, validation partnerships, and licensing discussions.