Technical White Paper | H-BDVAE v5.10

DNAI: The First Physics-Constrained Digital Twin for Oncology

A foundational technical overview of the Neuro-Symbolic architecture powering precision oncology simulation.

January 2026
Model Version: H-BDVAE v5.10 (Production)
1 Executive Summary: The "Glass Box" Revolution

Artificial Intelligence in oncology is currently facing a crisis of trust. Standard "Black Box" Deep Learning models are powerful pattern matchers, but they are statistically fragile. They suffer from Temporal Decoupling—predicting tumor shrinkage (phenotype) while simultaneously predicting a rise in resistance markers (genotype)—a biological contradiction that destroys clinical confidence.

DNAI (Digital Anterior Neuro-oncology Atlas) introduces a paradigm shift: The Neuro-Symbolic Digital Twin.

Instead of choosing between Deep Learning (Perception) and Differential Equations (Reasoning), DNAI fuses them into a single, end-to-end differentiable architecture. At its core is H-BDVAE v5.10, a foundation model whose latent code for biological identity is statistically orthogonal to its tumor growth signal (R² < 0.001), which addresses the "Latent Collapse" problem that affects standard architectures.

The Hybrid Engine Architecture

DNAI automatically routes data through the optimal pipeline based on input source, enabling both high-accuracy human predictions and robust cross-species translation.

Path A: The Specialist (v3.1)
Input: Human Multi-Omics
Latent Dimension: 328d
C-index: 0.704
Optimized for: Accuracy

Full multi-modal encoding (RNA + DNA + Methylation + CNV) for maximum predictive power on human clinical data.

Path B: The Translator (DSN Pipeline)
Input: PDX RNA-seq
Latent Dimension: 201d → DSN → 281d
C-index: 0.687
Optimized for: Robustness

Domain Separation Network strips mouse stroma signal, enabling cross-species transfer from PDX to human predictions.

Uncertainty-Gated Predictions
C-index (Green tier): 0.74
PDX Trajectory R²: 0.91
Physics Violations: <0.1%

When epistemic uncertainty is low (the Green tier), the Hybrid Engine reaches a C-index of 0.74 on human data and a trajectory R² of 0.91 on PDX data, with physics violations below 0.1%.
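
One way to picture uncertainty gating is as a filter applied before a prediction is reported. The sketch below is illustrative only: the tier names other than Green, the numeric thresholds, and the use of ensemble spread as a stand-in for epistemic uncertainty are assumptions, not details of the DNAI implementation.

```python
from statistics import mean, pstdev

# Hypothetical thresholds on ensemble spread; not taken from DNAI.
GREEN_MAX_SPREAD = 0.05
AMBER_MAX_SPREAD = 0.15


def gate_prediction(ensemble_risks):
    """Return (mean risk, tier) for one patient from an ensemble of risk scores.

    The spread across ensemble members stands in for epistemic uncertainty.
    """
    risk = mean(ensemble_risks)
    spread = pstdev(ensemble_risks)
    if spread <= GREEN_MAX_SPREAD:
        tier = "green"
    elif spread <= AMBER_MAX_SPREAD:
        tier = "amber"
    else:
        tier = "red"
    return risk, tier


# Ensemble members agree closely, so the prediction passes the gate.
confident = gate_prediction([0.61, 0.63, 0.62, 0.60])
# Ensemble members disagree, so the prediction is flagged.
uncertain = gate_prediction([0.20, 0.75, 0.45, 0.90])
```

Under this scheme, headline metrics such as the Green-tier C-index would be computed only over patients whose tier is "green".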

End-to-End Data Pipeline

The following diagram illustrates the complete DNAI platform architecture, showing the flow from raw multi-modal patient data through the perception engine, hypernetwork fusion, and dual-path reasoning system to final physics-constrained clinical predictions.

[Figure: DNAI platform architecture]

Preprocessing: WGS/WES → DeepSomatic → CADD → DNA Mut
Input Data, Molecular (Omics): RNA-seq, DNA Mut, CNV, Methylation
Input Data, Imaging: WSI (Pathology), CT Radiology
Foundation (H-BDVAE v5.10): output z_bio (328-dim); orthogonality R² < 0.001; determines Growth Rate (ρ)
Imaging (UNI / Virchow Encoders): WSI output z_wsi (1536-dim); CT output z_rad (128-dim); determines Carrying Capacity (K)
Late Fusion v3.1 (Gated Hypernetwork v3.2): fuses z_bio + z_wsi + z_rad → physics parameters ρ (growth), β (drug), K (capacity), ω (immune), N₀ (initial), σ (noise); C-index 0.704 / 0.74; Physics 100%; Triangulated 3/3 Pass
V1 Static Path: CausalDriver-GAT (GATv2 on PPI Graph; AUROC 0.95, Top-10: 0.85) → Driver Genes; TxResponse (Concept Bottleneck Model; 50 concepts, Spearman ρ = 0.72) → Drug Sensitivity
V2 Dynamic Path: Neural ODE (Dopri5 + Adjoint Backprop; PDX R² 0.91, <50ms inference) → Trajectories; EvoSim (Euler-Maruyama SDE Solver; stochastic σ, Phylo Acc 0.89) → Clone Evolution
Clinical Predictions: Survival (C=0.704), Treatment Ranking, RECIST Forecast, Clone Evolution, Dose Optimization
Legend: Foundation (VAE), Imaging Encoders, Late Fusion, Physics (V2)
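
The diagram names an Euler-Maruyama SDE solver for the EvoSim block. As a minimal sketch of that numerical scheme, the code below integrates a stochastic logistic growth model. The multiplicative noise term σ·N and all parameter values are assumptions chosen for illustration; the actual EvoSim equations are not given here.

```python
import math
import random


def euler_maruyama(n0, rho, k, sigma, dt, steps, seed=0):
    """Simulate dN = rho*N*(1 - N/K) dt + sigma*N dW with Euler-Maruyama.

    The multiplicative noise term sigma*N is an assumption for illustration.
    """
    rng = random.Random(seed)
    n = n0
    path = [n]
    for _ in range(steps):
        drift = rho * n * (1.0 - n / k)
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        n = n + drift * dt + sigma * n * dw
        n = max(n, 0.0)  # keep the state non-negative
        path.append(n)
    return path


# sigma = 0 recovers the deterministic logistic curve.
deterministic = euler_maruyama(n0=0.1, rho=0.5, k=1.0, sigma=0.0, dt=0.01, steps=2000)
# sigma > 0 yields one random trajectory per seed.
noisy = euler_maruyama(n0=0.1, rho=0.5, k=1.0, sigma=0.2, dt=0.01, steps=2000, seed=1)
```

Running the solver over many seeds gives a distribution of trajectories rather than a single curve, which is what a stochastic σ is meant to capture.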

Key Innovation: End-to-End Differentiability

The entire pipeline is end-to-end differentiable, allowing gradients to flow from clinical outcomes back through the physics simulation to the multi-modal encoders. This enables joint optimization of perception and reasoning—a capability unique to the DNAI platform that traditional modular systems cannot achieve.
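
To make the idea of gradients flowing through a physics simulation concrete, here is a minimal sketch that fits a growth rate by gradient descent through an Euler-integrated logistic model. It uses forward sensitivity propagation, not the adjoint method named in the architecture diagram, and the function names, learning rate, and parameter values are all hypothetical.

```python
def simulate_with_sensitivity(rho, k, n0, dt, steps):
    """Euler-integrate logistic growth and carry dN/d(rho) along with N."""
    n, dn_drho = n0, 0.0
    for _ in range(steps):
        f = rho * n * (1.0 - n / k)
        df_dn = rho * (1.0 - 2.0 * n / k)  # partial of f w.r.t. N
        df_drho = n * (1.0 - n / k)        # partial of f w.r.t. rho
        # Both updates use the values from the start of the step.
        n, dn_drho = n + dt * f, dn_drho + dt * (df_dn * dn_drho + df_drho)
    return n, dn_drho


def fit_growth_rate(observed, k, n0, dt, steps, rho=0.1, lr=0.05, iters=200):
    """Gradient descent on rho to match one observed tumor burden."""
    for _ in range(iters):
        pred, dpred_drho = simulate_with_sensitivity(rho, k, n0, dt, steps)
        grad = 2.0 * (pred - observed) * dpred_drho  # d/d(rho) of squared error
        rho -= lr * grad
    return rho


# Generate a synthetic "observation" with a known growth rate, then recover it.
target, _ = simulate_with_sensitivity(rho=0.3, k=1.0, n0=0.05, dt=0.1, steps=100)
recovered = fit_growth_rate(target, k=1.0, n0=0.05, dt=0.1, steps=100)
```

In a full pipeline the same chain rule would continue past ρ into the encoder weights that produced it, which is what joint optimization of perception and reasoning refers to.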

2 The Perception Engine: H-BDVAE v5.10

The foundation of the DNAI platform is the Hierarchical Biologically Disentangled Variational Autoencoder (H-BDVAE) v5.10. This model ingests high-dimensional multi-omics data (RNA, DNA, CNV, Methylation) and compresses it into a robust, interpretable latent representation (z_bio).

2.1 Solving "Latent Collapse"

Standard VAEs in oncology suffer from posterior collapse, where the model ignores subtle epigenetic signals in favor of strong proliferation markers (RNA). DNAI v5.10 solves this via a proprietary Additive Decoder Architecture, forcing the model to learn distinct latent distributions for "Identity" vs. "Growth."

Key Performance Metrics (v5.10)

| Metric | DNAI v5.10 | Standard VAE (SOTA) | Implication |
| --- | --- | --- | --- |
| Biologic Purity (R²) | < 0.001 | 0.15 - 0.40 | Statistical Orthogonality. Zero leakage between tumor identity and growth rate. |
| Proliferation Capture (ρ_pred) | 0.96 | 0.70 - 0.85 | High correlation with ground-truth growth markers (MKI67). |
| Epigenetic Variance (z_meth) | 0.607 | ~0 (Collapsed) | Signal Recovery. Full resolution of methylation-driven resistance (e.g., MGMT). |

2.2 Why Orthogonality Matters

Without It

The model finds a "BRCA1 mutation" but conflates it with fast growth signals. It cannot distinguish a driver from a fast-growing passenger.

With It

DNAI isolates the causal driver of disease identity independent of the current growth rate, allowing for accurate counterfactual simulation.
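
A simple way to audit orthogonality is to measure how much of a growth signal can be linearly explained by an identity latent. The sketch below computes R² as the squared Pearson correlation on synthetic data. The data, the leakage coefficient, and the choice of a single scalar latent are assumptions for illustration; the exact evaluation protocol behind the reported R² < 0.001 may differ.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


rng = random.Random(0)
growth = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Entangled code: the identity latent leaks part of the growth signal.
entangled = [0.5 * g + rng.gauss(0.0, 1.0) for g in growth]
# Disentangled code: the identity latent is sampled independently of growth.
disentangled = [rng.gauss(0.0, 1.0) for _ in growth]

r2_entangled = r_squared(entangled, growth)
r2_disentangled = r_squared(disentangled, growth)
```

The entangled latent shows a clearly non-zero R² with growth, while the independent latent stays close to zero, which is the behavior the "Biologic Purity" metric is meant to detect.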

3 The Simulator: Neural ODEs & Physics

DNAI utilizes a Neuro-Symbolic approach where the Deep Learning encoder parameterizes a rigorous physical equation.

The Extended Lotka-Volterra Equation

We treat the tumor as a dynamic population governed by three forces: Intrinsic Growth, Therapy Decay, and Immune Clearance.

Governing Equation
dN/dt = ρN(1 - N/K) - βD(t)N - ωI(t)N
Terms: ρ (Growth), K (Capacity), β (Therapy), ω (Immune)
ρ (Growth)

Derived from z_bio.

β (Therapy)

A function of Pharmacokinetics and Intrinsic Sensitivity (IC50).

ω (Immune)

A non-linear clearance term modulated by checkpoint saturation (PD-L1/CTLA-4).
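
As a worked example of what the governing equation implies, consider the simplified case where drug exposure and immune activity are held constant, D(t) = D and I(t) = I. This constancy is an assumption made only for the derivation. Setting dN/dt = 0 gives the steady states:

```latex
\begin{aligned}
0 &= \rho N^{*}\left(1 - \frac{N^{*}}{K}\right) - \beta D N^{*} - \omega I N^{*} \\
  &= N^{*}\left[\rho - \beta D - \omega I - \frac{\rho N^{*}}{K}\right] \\
\Rightarrow\quad N^{*} &= 0
  \quad\text{or}\quad
  N^{*} = K\left(1 - \frac{\beta D + \omega I}{\rho}\right).
\end{aligned}
```

The non-zero steady state is positive only when ρ > βD + ωI. Otherwise the combined therapy and immune kill rate exceeds intrinsic growth and the tumor burden decays toward zero.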

Constraints

Mass conservation and non-negativity enforced via ReLU guardrails. The model cannot hallucinate negative tumor volume.
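
The following sketch integrates the governing equation numerically with a non-negativity clamp applied after every step. It uses a fixed-step RK4 integrator as a simple stand-in for an adaptive solver, and the parameter values and constant dose and immune schedules are hypothetical.

```python
def tumor_rhs(n, t, rho, k, beta, omega, dose, immune):
    """Right-hand side of dN/dt = rho*N*(1 - N/K) - beta*D(t)*N - omega*I(t)*N."""
    return rho * n * (1.0 - n / k) - beta * dose(t) * n - omega * immune(t) * n


def simulate(n0, rho, k, beta, omega, dose, immune, dt=0.01, t_end=60.0):
    """Fixed-step RK4 integration with a ReLU-style clamp on the state."""
    steps = int(round(t_end / dt))
    n, t = n0, 0.0
    trajectory = [n]
    args = (rho, k, beta, omega, dose, immune)
    for _ in range(steps):
        k1 = tumor_rhs(n, t, *args)
        k2 = tumor_rhs(n + 0.5 * dt * k1, t + 0.5 * dt, *args)
        k3 = tumor_rhs(n + 0.5 * dt * k2, t + 0.5 * dt, *args)
        k4 = tumor_rhs(n + dt * k3, t + dt, *args)
        n = n + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        n = max(n, 0.0)  # ReLU guardrail: tumor burden cannot go negative
        t += dt
        trajectory.append(n)
    return trajectory


# Hypothetical parameter values chosen only to make the example run.
untreated = simulate(0.05, rho=0.3, k=1.0, beta=0.0, omega=0.0,
                     dose=lambda t: 0.0, immune=lambda t: 0.0)
treated = simulate(0.05, rho=0.3, k=1.0, beta=0.2, omega=0.05,
                   dose=lambda t: 1.0, immune=lambda t: 1.0)
```

With no therapy the trajectory saturates at the carrying capacity K, while the treated trajectory levels off well below it. Passing a different `dose` callable changes the counterfactual without touching the rest of the model.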

4 Grounding Biology in Physics: Late Fusion v3.1

A purely molecular model is blind to the physical constraints of the tumor microenvironment. DNAI v3.1 introduces Gated Late Fusion to integrate Radiology (CT) and Pathology (WSI).

| Data Source | Latent Code | Physical Parameter | Meaning |
| --- | --- | --- | --- |
| Omics (RNA/DNA) | z_bio | Growth Rate (ρ) | How fast the cells want to divide. |
| Radiology (CT) | z_rad | Carrying Capacity (K) | How much space and nutrient supply exists. |

The Result

A simulation that accounts for spatial phenomena such as necrotic cores (where N → K) and vascular limitations.
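
As a rough illustration of how gated fusion could map latent codes to physics parameters, the sketch below derives ρ from the molecular code and blends a radiology-driven estimate of K with a prior through a sigmoid gate. The gate form, the softplus positivity constraint, the prior, and every weight are assumptions; the production hypernetwork is not specified at this level of detail.

```python
import math


def softplus(x):
    """Smooth map from any real number to a positive value."""
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)


def sigmoid(x):
    """Logistic function, used here as a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def fuse(z_bio, z_rad, w_rho, w_k, w_gate, k_prior=1.0):
    """Map latent codes to (rho, K, gate); all weights here are hypothetical.

    rho comes from the molecular code alone. K blends a prior with a
    radiology-driven estimate, weighted by a gate computed from z_rad.
    """
    rho = softplus(dot(w_rho, z_bio))
    gate = sigmoid(dot(w_gate, z_rad))
    k_from_ct = softplus(dot(w_k, z_rad))
    k = gate * k_from_ct + (1.0 - gate) * k_prior
    return rho, k, gate


# Toy 3-dimensional codes; real codes are far larger (e.g. 328-dim z_bio).
rho, k, gate = fuse(
    z_bio=[0.2, -0.1, 0.4],
    z_rad=[1.0, 0.5, -0.3],
    w_rho=[0.5, 0.1, -0.2],
    w_k=[0.3, 0.3, 0.3],
    w_gate=[1.0, -0.5, 0.2],
)
```

Softplus keeps both ρ and K strictly positive, so whatever the encoders emit, the downstream equation receives physically admissible parameters.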

5 The Sim-to-Real Bridge

How do we simulate human trajectories without longitudinal human data?

DNAI employs Split-Source Transfer Learning to bridge the data gap:

ρ (Intrinsic Growth)

Learned from dense PDX (Mouse) time-series data with daily tumor measurements.

Source: Patient-Derived Xenografts

ω (Immune Dynamics)

Learned strictly from Human I/O Trials (immunotherapy outcomes, checkpoint response).

Source: Human Clinical Trials

The Bridge

Fused via Unsupervised Domain Adaptation (UDA) to align species feature spaces.

Method: Domain-Adversarial Training

6 Validated Performance

We strictly separate Static Validation (Outcomes) from Dynamic Validation (Physics) to ensure methodological rigor.

STATIC VALIDATION (TCGA Held-Out)

Global C-index: 0.704
vs. SoC 0.62. Proves ranking accuracy within cancer types.

Uncertainty-Gated C-index: 0.74
On external CPTAC Green-tier subset (N=229) where ISS exceeds threshold.

DYNAMIC VALIDATION (PDX Held-Out)

Trajectory Fit (R²): 0.91
High fidelity to real biological growth curves.

Physics Violations: <0.1%
Admissible state constraints (non-negativity) rigorously enforced.
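
For readers unfamiliar with the C-index, the sketch below shows Harrell's concordance index on a toy cohort. It is a plain reference implementation of the standard metric, not the evaluation code used for the figures above, and the four-patient data set is invented for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    times  - observed follow-up times
    events - 1 if the event (e.g. death) was observed, 0 if censored
    risks  - predicted risk scores; higher means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has an observed event before j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in predicted risk count half
    return concordant / comparable


# Toy cohort: the third patient is censored at t = 12.
times = [5, 10, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
```

Here five pairs are comparable and four are ranked correctly, giving a C-index of 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported 0.704 and 0.74 in context.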

7 Conclusion

DNAI represents the convergence of high-dimensional data science and classical biological physics.

By solving the latent collapse problem with v5.10 and enforcing physical constraints through Neural ODEs, we have created a platform that is safe, interpretable, and causally valid. It transforms the practice of oncology from a series of static snapshots into a continuous, optimizing movie.

The Bottom Line

DNAI doesn't just predict outcomes. It simulates the biological laws driving them, so that forecasts stay within the physical constraints encoded in the model (fewer than 0.1% physics violations on held-out PDX data).
