| Title: | Semi-Parametric Association Surfaces for Joint Longitudinal-Survival Models |
|---|---|
| Description: | Implements interpretable multi-biomarker fusion in joint longitudinal-survival models via semi-parametric association surfaces. Provides a two-stage estimation framework where Stage 1 fits mixed-effects longitudinal models and extracts Best Linear Unbiased Predictors ('BLUP's), and Stage 2 fits transition-specific penalized Cox models with tensor-product spline surfaces linking latent biomarker summaries to transition hazards. Supports multi-state disease processes with transition-specific surfaces, Restricted Maximum Likelihood ('REML') smoothing parameter selection, effective degrees of freedom ('EDF') diagnostics, dynamic prediction of transition probabilities, and three interpretability visualizations (surface plots, contour heatmaps, marginal effect slices). Methods are described in Bhattacharjee (2025, under review). |
| Authors: | Atanu Bhattacharjee [aut, cre] (ORCID: <https://orcid.org/0000-0002-5757-5513>) |
| Maintainer: | Atanu Bhattacharjee <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.0 |
| Built: | 2026-05-27 08:59:32 UTC |
| Source: | https://github.com/cran/jmSurface |
The jmSurface package implements interpretable multi-biomarker fusion in joint longitudinal-survival models via semi-parametric association surfaces for multi-state disease processes.

| Function | Purpose |
|---|---|
| jmSurf | Fit the two-stage joint model |
| fit_longitudinal | Stage 1: fit longitudinal submodels |
| fit_gam_cox | Stage 2: fit GAM-Cox with tensor-product surface |
| edf_diagnostics | Extract EDF and complexity diagnostics |
| dynPred | Dynamic prediction of transition probabilities |
| plot_surface | 3D association surface visualization |
| contour_heatmap | Contour heatmap of danger zones |
| marginal_slices | Marginal effect slice plots |
| simulate_jmSurface | Simulate multi-state data |
| run_shiny_app | Launch interactive Shiny dashboard |

Maintainer: Atanu Bhattacharjee [email protected] (ORCID)
Bhattacharjee, A. (2025). Interpretable Multi-Biomarker Fusion in Joint Longitudinal-Survival Models via Semi-Parametric Association Surfaces.
Extracts subject-specific BLUPs from fitted lme models and computes
the latent trajectory at specified time points.
compute_blup_eta(lme_fits, patient_ids, times, markers = NULL)

| Argument | Description |
|---|---|
| lme_fits | Named list of fitted lme objects. |
| patient_ids | Numeric vector of patient IDs. |
| times | Numeric vector of evaluation times. |
| markers | Character vector of biomarker names. Default NULL. |

Data frame with columns patient_id, time, and one
eta_* column per biomarker.
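
As a rough illustration of the idea (not the package's internal code), the sketch below fits a random intercept-slope model with nlme::lme and evaluates each subject's BLUP-based line on a time grid. The Orthodont dataset and the helper name blup_eta are stand-ins chosen for this example only.

```r
library(nlme)

# Random intercept-slope model on Orthodont, a dataset shipped with nlme.
# It stands in for one biomarker here; it is not jmSurface data.
fit <- lme(distance ~ age, random = ~ age | Subject, data = Orthodont)

# coef() combines fixed effects and BLUPs into subject-specific coefficients.
subj_coef <- as.matrix(coef(fit))

# Hypothetical helper: evaluate each subject's fitted line on a time grid.
blup_eta <- function(subj_coef, ids, times) {
  grid <- expand.grid(time = times, patient_id = ids, stringsAsFactors = FALSE)
  grid$eta <- subj_coef[grid$patient_id, "(Intercept)"] +
    subj_coef[grid$patient_id, "age"] * grid$time
  out <- grid[, c("patient_id", "time", "eta")]
  rownames(out) <- NULL
  out
}

eta <- blup_eta(subj_coef, ids = c("M01", "F01"), times = c(8, 10, 12))
print(eta)
```

The result has one row per subject and time point, mirroring the patient_id, time, eta_* layout described above for a single biomarker.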
Produces a filled contour (heatmap) of the estimated association surface, identifying "danger zones" of elevated transition risk.
contour_heatmap(object, transition, n_grid = 50, col = NULL, main = NULL, ...)
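
| Argument | Description |
|---|---|
| object | A fitted jmSurface object. |
| transition | Character string specifying which transition. |
| n_grid | Integer grid resolution. Default 50. |
| col | Color palette function. Default NULL. |
| main | Title. Default NULL. |
| ... | Additional arguments passed to the underlying plotting function. |
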
Invisibly returns the prediction grid.
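
A prediction grid of this kind can be sketched in base R as follows. The latent summaries and the surface formula below are made up for illustration; in the package the values would come from the fitted model rather than a closed-form expression.

```r
# Hypothetical latent summaries for two biomarkers (not package output).
set.seed(1)
eta1 <- rnorm(200)
eta2 <- rnorm(200)

n_grid <- 50
x <- seq(min(eta1), max(eta1), length.out = n_grid)
y <- seq(min(eta2), max(eta2), length.out = n_grid)

# expand.grid varies the first variable fastest, so the values reshape
# into a matrix with rows indexed by x and columns indexed by y.
grid <- expand.grid(eta1 = x, eta2 = y)

# Stand-in surface on the log-hazard scale (an assumption for this sketch).
grid$log_hazard <- with(grid, 0.3 * eta1 - 0.2 * eta2 + 0.4 * eta1 * eta2)
z <- matrix(grid$log_hazard, nrow = n_grid, ncol = n_grid)
```

The objects x, y, and z are in the layout that graphics::filled.contour(x, y, z) expects.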
Computes personalized dynamic predictions of transition probabilities
from a fitted jmSurface model. Given a patient's biomarker
history up to a landmark time, projects the latent trajectories forward
and integrates the transition-specific hazard to obtain cumulative
transition probabilities.
dynPred(object, patient_id, landmark = 0, horizon = 3, n_points = 60)

| Argument | Description |
|---|---|
| object | A fitted jmSurface object. |
| patient_id | Integer patient identifier. |
| landmark | Numeric landmark time (predict from this time). Default 0. |
| horizon | Numeric prediction horizon (predict this many years ahead). Default 3. |
| n_points | Integer number of time points for the prediction grid. Default 60. |

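The conditional transition probability is computed as:
$$P_i^{rs}(t_0 + h \mid t_0) = 1 - \exp\left\{-\int_{t_0}^{t_0 + h} \lambda_i^{rs}\big(u \mid \hat{\eta}_i(u)\big)\, du\right\}$$
where $t_0$ is the landmark time, $h$ is the prediction horizon, $\hat{\eta}_i(u)$ is the BLUP-projected trajectory and the integral
is approximated via the Breslow estimator.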
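
The step from a hazard to a cumulative transition probability can be illustrated numerically. The sketch below assumes a made-up hazard function along the projected trajectory and uses a trapezoidal rule in place of the Breslow-based approximation mentioned above; hazard_fn is hypothetical.

```r
# Hypothetical hazard along a projected trajectory (assumed for this sketch).
hazard_fn <- function(t) 0.05 * exp(0.2 * t)

landmark <- 2
horizon <- 3
n_points <- 60

time <- seq(landmark, landmark + horizon, length.out = n_points)
hazard <- hazard_fn(time)

# Trapezoidal cumulative hazard, starting at zero at the landmark.
dt <- diff(time)
cum_hazard <- c(0, cumsum(dt * (head(hazard, -1) + tail(hazard, -1)) / 2))

# Conditional probability of the transition by each time, given no
# transition before the landmark.
risk <- 1 - exp(-cum_hazard)

pred <- data.frame(time = time, risk = risk, hazard = hazard)
tail(pred, 3)
```

The resulting data frame has the same time, risk, and hazard columns as the first three columns of the returned value described below.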
Data frame with columns:

| Column | Description |
|---|---|
| time | Absolute time points |
| risk | Cumulative transition probability |
| hazard | Instantaneous hazard at each time point |
| transition | Transition name |
| to_state | Target state |
| patient_id | Patient identifier |
| landmark | Landmark time used |

Extracts EDF, deviance explained, and complexity diagnostics for each transition-specific association surface. EDF near 1 indicates linearity; EDF > 3 indicates substantial nonlinearity/interaction.
edf_diagnostics(object)

| Argument | Description |
|---|---|
| object | A fitted jmSurface object. |

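The EDF is computed as
$$\mathrm{EDF} = \mathrm{tr}\left\{(\mathbf{X}^\top \mathbf{W} \mathbf{X} + \mathbf{S}_\lambda)^{-1} \mathbf{X}^\top \mathbf{W} \mathbf{X}\right\}$$
where $\mathbf{X}$ is the spline design matrix, $\mathbf{W}$ the working weight matrix, and $\mathbf{S}_\lambda$ the smoothing penalty. It represents the realized complexity of the association surface after REML-based penalization. Interpretation: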
EDF ~ 1: Surface effectively linear; standard parametric JM suffices
1 < EDF <= 3: Moderate nonlinearity
EDF > 3: Substantial nonlinearity and/or interaction effects
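
To see how penalization shrinks the effective degrees of freedom, the sketch below evaluates the generic penalized-regression trace tr{(X'X + lambda S)^{-1} X'X} with unit weights. The polynomial basis and the penalty matrix are simple stand-ins for a spline basis and its penalty, not what the package uses.

```r
set.seed(1)
n <- 200
x <- runif(n)

# Stand-in basis: intercept plus six orthogonal polynomial columns.
X <- cbind(1, poly(x, degree = 6))

# Stand-in penalty: intercept and linear term unpenalized, the rest penalized.
S <- diag(c(0, 0, rep(1, 5)))

# EDF as the trace of (X'X + lambda * S)^{-1} X'X.
edf_for <- function(lambda) {
  XtX <- crossprod(X)
  sum(diag(solve(XtX + lambda * S, XtX)))
}

lambdas <- c(0, 1, 100, 1e6)
edf_values <- sapply(lambdas, edf_for)
print(data.frame(lambda = lambdas, edf = edf_values))
```

With no penalty the EDF equals the number of basis columns (7 here); as lambda grows it falls toward the dimension of the unpenalized part (2 here, because this toy basis counts the intercept).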
Data frame with columns:

| Column | Description |
|---|---|
| transition | Transition name |
| edf | Effective degrees of freedom of the surface smooth |
| deviance_explained | Proportion of deviance explained |
| n_obs | Number of observations |
| n_events | Number of events |
| complexity | Character label: "Linear", "Moderate", or "Nonlinear" |
| p_value | Approximate p-value for the smooth term |

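
The complexity label can be thought of as a simple binning of the EDF. The sketch below is one possible mapping: the upper cutoff of 3 follows the interpretation given above, while the tolerance used for "EDF ~ 1" (linear_tol = 1.5) is an assumption of this example, as is the helper name classify_edf.

```r
# Hypothetical helper: map EDF values to complexity labels.
# linear_tol is an assumed tolerance for "EDF ~ 1".
classify_edf <- function(edf, linear_tol = 1.5) {
  as.character(cut(
    edf,
    breaks = c(-Inf, linear_tol, 3, Inf),
    labels = c("Linear", "Moderate", "Nonlinear"),
    right = TRUE
  ))
}

diag_tab <- data.frame(
  transition = c("A", "B", "C", "D"),   # placeholder transition names
  edf = c(1.02, 2.4, 3, 5.1)
)
diag_tab$complexity <- classify_edf(diag_tab$edf)
print(diag_tab)
```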
Fits a penalized Cox proportional hazards model with a tensor-product
spline surface for the latent biomarker summaries using mgcv::gam
with family = cox.ph().
fit_gam_cox(
  data,
  covariates = c("age_baseline", "sex"),
  k_marginal = c(5, 5),
  k_additive = 6,
  bs = "tp",
  method = "REML"
)
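
| Argument | Description |
|---|---|
| data | Analysis data frame for one transition, including the latent biomarker summaries. |
| covariates | Character vector of covariate names. Default c("age_baseline", "sex"). |
| k_marginal | Integer vector of marginal basis dimensions. Default c(5, 5). |
| k_additive | Integer basis dimension for additive smooth of third biomarker. Default 6. |
| bs | Spline basis type. Default "tp". |
| method | Smoothing method. Default "REML". |
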
A mgcv::gam object, or NULL on failure.
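
For orientation, here is a minimal sketch of a penalized Cox fit with a tensor-product surface in mgcv on simulated data. The data-generating model, the column names eta1, eta2, time, and status, and the use of te() for the surface are assumptions of this example; the package's internal formula may differ.

```r
library(mgcv)

set.seed(42)
n <- 400
d <- data.frame(
  eta1 = rnorm(n),                 # hypothetical latent summary, biomarker 1
  eta2 = rnorm(n),                 # hypothetical latent summary, biomarker 2
  age_baseline = rnorm(n, 60, 10),
  sex = rbinom(n, 1, 0.5)
)

# Simulated event times with an interaction between the two summaries.
lp <- 0.4 * d$eta1 - 0.3 * d$eta2 + 0.5 * d$eta1 * d$eta2 +
  0.02 * (d$age_baseline - 60)
event_time <- rexp(n, rate = 0.1 * exp(lp))
cens_time <- runif(n, 2, 15)
d$time <- pmin(event_time, cens_time)
d$status <- as.numeric(event_time <= cens_time)

# With family = cox.ph(), the response is the time and the event
# indicator is supplied through `weights`. Return NULL on failure.
fit <- tryCatch(
  gam(time ~ te(eta1, eta2, k = c(5, 5), bs = "tp") + age_baseline + sex,
      family = cox.ph(), weights = status, data = d, method = "REML"),
  error = function(e) NULL
)

if (!is.null(fit)) print(summary(fit)$s.table)
```

The s.table row for te(eta1,eta2) reports the EDF and approximate p-value of the surface term.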
Stage 1 of the two-stage estimation: fits a random intercept-slope model
for each biomarker using nlme::lme.
fit_longitudinal(long_data, markers = NULL, verbose = TRUE)

| Argument | Description |
|---|---|
| long_data | Data frame of longitudinal biomarker measurements. |
| markers | Character vector of biomarker names to fit. Default NULL. |
| verbose | Logical; print progress. Default TRUE. |

Named list of nlme::lme objects, one per biomarker.
Failed fits are NULL.
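
The per-biomarker loop with NULL on failure can be sketched as below. The simulated data and the column names patient_id, visit_time, biomarker, and value are hypothetical; they are not the package's required layout.

```r
library(nlme)

set.seed(7)
n_id <- 40
long <- expand.grid(
  patient_id = seq_len(n_id),
  visit_time = 0:5,
  biomarker = c("eGFR", "BNP"),
  stringsAsFactors = FALSE
)

# Subject-specific random intercepts and slopes plus measurement noise.
b0 <- rnorm(n_id, 0, 2)
b1 <- rnorm(n_id, 0, 0.5)
long$value <- 50 + b0[long$patient_id] +
  (1 + b1[long$patient_id]) * long$visit_time +
  rnorm(nrow(long), 0, 1)

# Fit one random intercept-slope model; return NULL if the fit fails.
fit_one <- function(d) {
  tryCatch(
    lme(value ~ visit_time, random = ~ visit_time | patient_id, data = d),
    error = function(e) NULL
  )
}

# One fit per biomarker, returned as a named list.
lme_fits <- lapply(split(long, long$biomarker), fit_one)
print(names(lme_fits))
```

Wrapping each fit in tryCatch means one non-converging biomarker does not abort the remaining fits.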
Two-stage estimation framework for multi-state joint models with tensor-product spline association surfaces. Stage 1 fits mixed-effects longitudinal models and extracts BLUPs. Stage 2 fits transition-specific penalized Cox models with tensor-product spline surfaces via REML.
jmSurf(
  long_data,
  surv_data,
  transitions = NULL,
  covariates = c("age_baseline", "sex"),
  k_marginal = c(5, 5),
  k_additive = 6,
  bs = "tp",
  method = "REML",
  min_events = 10,
  verbose = TRUE
)

| Argument | Description |
|---|---|
| long_data | Data frame of longitudinal biomarker measurements. |
| surv_data | Data frame of survival/transition events. |
| transitions | Character vector of transitions to model (e.g., "CKD -> CVD"). Default NULL. |
| covariates | Character vector of baseline covariate names. Default c("age_baseline", "sex"). |
| k_marginal | Integer vector of length 1 or 2 giving marginal basis dimensions for the tensor-product spline. Default c(5, 5). |
| k_additive | Integer giving the basis dimension for the additive smooth of the third biomarker (if present). Default 6. |
| bs | Character string for the spline basis type. Default "tp". |
| method | Smoothing parameter estimation method. Default "REML". |
| min_events | Integer minimum number of events required to fit a transition model. Default 10. |
| verbose | Logical; print progress messages. Default TRUE. |

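The model for each transition is:
$$\lambda_i^{rs}\big(t \mid \hat{\eta}_i(t)\big) = \lambda_0^{rs}(t) \exp\left\{\boldsymbol{\gamma}_{rs}^\top \mathbf{w}_i + f_{rs}\big(\hat{\eta}_i(t)\big)\right\}$$
where $\lambda_0^{rs}(t)$ is the baseline hazard for the transition from state $r$ to state $s$, $\mathbf{w}_i$ are baseline covariates, $f_{rs}(\cdot)$ is a semi-parametric association surface represented via
tensor-product splines, and $\hat{\eta}_i(t)$ are BLUP-based latent
longitudinal summaries evaluated at the midpoint of each sojourn interval.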
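
The midpoint evaluation of the latent summaries can be illustrated with a few lines of base R. The interval columns entry_time and exit_time, the coefficient table, and the numbers in it are hypothetical and serve only to show the arithmetic.

```r
# Hypothetical sojourn intervals (one row per stay in a state).
surv <- data.frame(
  patient_id = c(1, 1, 2),
  entry_time = c(0, 2.5, 0),
  exit_time = c(2.5, 6, 4)
)

# Hypothetical subject-specific intercepts and slopes from Stage 1.
coefs <- data.frame(
  patient_id = c(1, 2),
  intercept = c(60, 75),
  slope = c(-2, -0.5)
)

# Midpoint of each sojourn interval.
surv$t_mid <- (surv$entry_time + surv$exit_time) / 2

# Latent summary evaluated at the midpoint for each row.
idx <- match(surv$patient_id, coefs$patient_id)
surv$eta_egfr <- coefs$intercept[idx] + coefs$slope[idx] * surv$t_mid
print(surv)
```

Each row then carries one latent value per biomarker, which is what the Stage 2 Cox model uses as a covariate for that interval.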
An object of class "jmSurface" containing:

| Component | Description |
|---|---|
| lme_fits | Named list of fitted lme objects |
| gam_fits | Named list of fitted gam objects |
| eta_data | Named list of analysis data frames with latent summaries |
| transitions | Character vector of fitted transitions |
| biomarkers | Character vector of biomarker names |
| covariates | Character vector of covariate names used |
| edf | Named numeric vector of EDF values per transition |
| deviance_explained | Named numeric vector of deviance explained per transition |
| call | The matched call |

Bhattacharjee, A. (2025). Interpretable Multi-Biomarker Fusion in Joint Longitudinal-Survival Models via Semi-Parametric Association Surfaces.
Bhattacharjee, A. (2024). jmBIG: Scalable Joint Models for Big Data.
Wood, S.N. (2017). Generalized Additive Models: An Introduction with R. Chapman & Hall/CRC.
Tsiatis, A.A. & Davidian, M. (2004). Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica, 14, 809-834.
# Simulate data
sim <- simulate_jmSurface(n_patients = 300)
# Fit the joint model
fit <- jmSurf(
  long_data = sim$long_data,
  surv_data = sim$surv_data,
  covariates = c("age_baseline", "sex")
)
# Summary with EDF diagnostics
summary(fit)
# Dynamic prediction for patient 1
pred <- dynPred(fit, patient_id = 1, landmark = 2, horizon = 3)
# Visualize surfaces
plot_surface(fit, transition = "CKD -> CVD")
contour_heatmap(fit, transition = "CKD -> CVD")
marginal_slices(fit, transition = "CKD -> CVD")
Convenience function to load the bundled CKD/CVD/Diabetes multi-state
cohort data (2,000 patients, 3 biomarkers, 9 transitions) from the
CSV files shipped with the package. Returns both the longitudinal
biomarker data and survival event data in a list, ready for use
with jmSurf.
The longitudinal data contains 68,112 biomarker measurements (eGFR, BNP, HbA1c) and the survival data contains 4,701 rows covering 9 transition types across CKD, CVD, Diabetes, and At-risk states.
load_example_data()
A list with two data frames:

| Component | Description |
|---|---|
| long_data | Longitudinal biomarker measurements (68,112 rows, 5 columns) |
| surv_data | Survival/transition events (4,701 rows, 17 columns) |

dat <- load_example_data()
str(dat$long_data)
str(dat$surv_data)
# Fit a model directly
fit <- jmSurf(dat$long_data, dat$surv_data, verbose = FALSE)
summary(fit)
Produces marginal effect slice plots showing the effect of one biomarker on the log-hazard at fixed quantiles (Q25, Q50, Q75) of the other. Diverging slices indicate interaction; parallel slices indicate additivity.
marginal_slices(
  object,
  transition,
  n_points = 60,
  quantiles = c(0.25, 0.5, 0.75),
  colors = c("#264653", "#e76f51", "#2a9d8f"),
  main = NULL,
  ...
)
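
| Argument | Description |
|---|---|
| object | A fitted jmSurface object. |
| transition | Character string specifying which transition. |
| n_points | Integer number of evaluation points. Default 60. |
| quantiles | Numeric vector of quantile probabilities for slicing. Default c(0.25, 0.5, 0.75). |
| colors | Character vector of colors for each slice. Default c("#264653", "#e76f51", "#2a9d8f") (dark blue-green, orange, teal). |
| main | Title. Default NULL. |
| ... | Additional arguments passed to the underlying plotting function. |
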
Invisibly returns the data frame of slice values.
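
The reading of parallel versus diverging slices can be checked on toy surfaces. In the sketch below, slice_surface and gap_range are hypothetical helpers and both surfaces are invented closed-form functions, not fitted model output.

```r
# Hypothetical helper: evaluate f(x, y) along x at fixed quantiles of y.
# Returns a matrix with one column per slice.
slice_surface <- function(f, x, y_obs, probs = c(0.25, 0.5, 0.75)) {
  y_q <- quantile(y_obs, probs, names = FALSE)
  sapply(y_q, function(y) f(x, y))
}

x <- seq(-2, 2, length.out = 60)
set.seed(3)
y_obs <- rnorm(500)

# Additive surface: no interaction between the two biomarkers.
additive <- slice_surface(function(x, y) 0.5 * x + 0.3 * y, x, y_obs)

# Surface with an interaction term.
interact <- slice_surface(function(x, y) 0.5 * x + 0.3 * y + 0.4 * x * y,
                          x, y_obs)

# How much the gap between the Q75 and Q25 slices varies along x:
# zero for parallel slices, positive when slices diverge.
gap_range <- function(s) diff(range(s[, 3] - s[, 1]))

gap_range(additive)
gap_range(interact)
```

A constant gap between slices means the effect of one biomarker does not depend on the level of the other; a gap that changes along the x-axis signals interaction.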
Produces a 3D perspective plot of the estimated semi-parametric association
surface for a specified transition.
plot_surface(
  object,
  transition,
  n_grid = 40,
  theta = -30,
  phi = 25,
  col = NULL,
  main = NULL,
  ...
)
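
| Argument | Description |
|---|---|
| object | A fitted jmSurface object. |
| transition | Character string specifying which transition to plot. |
| n_grid | Integer grid resolution. Default 40. |
| theta, phi | Viewing angles for the perspective plot. Defaults -30 and 25. |
| col | Color palette. Default NULL. |
| main | Title. Default NULL. |
| ... | Additional arguments passed to the underlying plotting function. |
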
Invisibly returns the prediction grid with fitted values.
Default plot dispatches to plot_surface for the
first transition.
## S3 method for class 'jmSurface'
plot(x, transition = NULL, type = c("surface", "heatmap", "slices"), ...)

| Argument | Description |
|---|---|
| x | A fitted jmSurface object. |
| transition | Which transition to plot. Default: first. |
| type | One of "surface", "heatmap", or "slices". |
| ... | Additional arguments. |

Invisibly returns the prediction grid (a data frame) produced by
the dispatched plotting function (plot_surface,
contour_heatmap, or marginal_slices).
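
The dispatch pattern can be sketched with a toy S3 class. The class name demoSurface, the second transition label, and the placeholder return values are invented for this example; the real method calls the package's plotting functions instead.

```r
# Toy S3 plot method showing type-based dispatch (no plotting is done here).
plot.demoSurface <- function(x, transition = NULL,
                             type = c("surface", "heatmap", "slices"), ...) {
  # match.arg() picks the first choice when `type` is not supplied.
  type <- match.arg(type)

  # Fall back to the first fitted transition.
  if (is.null(transition)) transition <- x$transitions[1]

  # Each branch stands in for a call to the corresponding plotting function.
  grid <- switch(type,
    surface = data.frame(transition = transition, view = "surface"),
    heatmap = data.frame(transition = transition, view = "heatmap"),
    slices  = data.frame(transition = transition, view = "slices")
  )
  invisible(grid)
}

# "CVD -> CKD" is a made-up second transition for illustration.
obj <- structure(list(transitions = c("CKD -> CVD", "CVD -> CKD")),
                 class = "demoSurface")

res_default <- plot(obj)
res_heatmap <- plot(obj, transition = "CVD -> CKD", type = "heatmap")
```

Because the value is returned invisibly, nothing prints at the console unless the result is assigned or wrapped in print().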
Brief overview of a fitted jmSurface model.
## S3 method for class 'jmSurface'
print(x, ...)

| Argument | Description |
|---|---|
| x | A fitted jmSurface object. |
| ... | Ignored. |

The input object x, returned invisibly. Called for its
side effect of printing a brief model overview to the console.
Launches the interactive Shiny dashboard for personalized multi-state joint modeling with semi-parametric association surfaces. The app provides data upload, exploration, model fitting, personalized prediction, and surface visualization.
run_shiny_app(...)

| Argument | Description |
|---|---|
| ... | Additional arguments passed on when launching the app. |

The Shiny app requires additional packages: shiny, shinydashboard,
shinyWidgets, ggplot2, viridis, plotly, dplyr,
tidyr, DT, gridExtra.
No return value, called for the side effect of launching an interactive Shiny application in the user's default web browser.
Generates realistic simulated data for a multi-state chronic disease cohort (CKD/CVD/Diabetes) with three longitudinal biomarkers (eGFR, BNP, HbA1c) and bidirectional transitions. Includes demographic covariates and realistic biomarker trajectories.
simulate_jmSurface(
  n_patients = 500,
  max_followup = 15,
  max_events = 3,
  seed = 42
)

| Argument | Description |
|---|---|
| n_patients | Integer number of patients. Default 500. |
| max_followup | Numeric maximum follow-up time in years. Default 15. |
| max_events | Integer maximum number of events per patient. Default 3. |
| seed | Integer random seed. Default 42. |

A list with two data frames:

| Component | Description |
|---|---|
| long_data | Longitudinal biomarker measurements |
| surv_data | Survival/transition data |

sim <- simulate_jmSurface(n_patients = 200, seed = 123)
head(sim$long_data)
table(sim$surv_data$transition)
Provides a comprehensive summary of the fitted joint model including longitudinal submodel parameters, transition-specific surface EDF and deviance explained, and model configuration.
## S3 method for class 'jmSurface'
summary(object, ...)

| Argument | Description |
|---|---|
| object | A fitted jmSurface object. |
| ... | Ignored. |

Invisibly returns a list of summary components.