Reproducible Statistical Computing
Tools, frameworks, and practices for reproducible research in biostatistics
2026-06-07 18:16 PDT
Overview
Reproducibility is not a stylistic preference but a scientific requirement: an analysis that cannot be independently rerun, verified, and modified is not a completed analysis. In practice, achieving computational reproducibility in clinical research requires attention to software environment management, data provenance, workflow automation, and the tooling that connects each step of the analysis pipeline from raw data to rendered report.
This program develops and maintains a suite of R packages and command-line tools that operationalize reproducible workflows for biostatistical research. The central organizing framework is zzcollab, which instantiates a Docker-based, renv-pinned research compendium from a single command. The surrounding tools handle specific tasks – longitudinal visualization, Table 1 construction, power analysis, electronic data capture, and output formatting – that arise repeatedly across projects.
Tools
zzcollab – Docker-based reproducible research compendium framework. Creates a complete project scaffold (Dockerfile, renv.lock, .Rprofile, source code, data directory) from a single CLI command, targeting five research profiles (minimal, analysis, modeling, publishing, shiny).
zzedc – Electronic data capture system for clinical trials, providing form design, validation, and data export in a reproducible R-based pipeline.
zzrenvcheck – Validation tool for renv package dependency graphs, detecting mismatches between renv.lock and the active R library.
zzlongplot – Longitudinal data visualization for clinical trials, with MMRM-style trajectory displays and individual-profile overlays.
zztable1 – Next-generation Table 1 construction for clinical research, supporting multi-format output (LaTeX, HTML, plain text) from a single function call.
zzobj2fig – Renders any R modeling output as a publication-quality LaTeX or Typst table.
zzworld – WORLD-backwards cognitive test scoring and edit-distance analysis, implementing five scoring rules from the MMSE literature.
zzfisher – Fisher’s exact test for r×2 contingency tables with exact power analysis.
zzpower – Interactive power analysis calculator for clinical-trial designs.
nof1power – Power analysis and simulation for N-of-1 and parallel-group trial designs.
zzgit – Interactive git add/commit/push for zsh with Conventional Commits wizard and secret scanning.
zzvim-R – Vim/Neovim plugin for R integration: send code to an R session from the editor.
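To make the lockfile validation described for zzrenvcheck concrete, the following is a minimal Python sketch of the general idea: comparing the versions pinned in an renv.lock file (which is JSON with a top-level "Packages" object) against the versions found in a library. It is not the zzrenvcheck implementation. The function name lockfile_mismatches, the example lockfile, the installed-package dictionary, and all version numbers are hypothetical and exist only for illustration; a real check would read the installed versions from the active R library.

```python
import json


def lockfile_mismatches(lock_text, installed):
    """Compare versions pinned in renv.lock JSON text with installed versions.

    `installed` is a hypothetical stand-in for the active R library:
    a dict mapping package name -> installed version string.
    """
    lock = json.loads(lock_text)
    # renv.lock stores one record per package under the "Packages" key.
    pinned = {
        name: record.get("Version")
        for name, record in lock.get("Packages", {}).items()
    }
    # Pinned in the lockfile but absent from the library.
    missing = sorted(set(pinned) - set(installed))
    # Present in the library but not recorded in the lockfile.
    unlocked = sorted(set(installed) - set(pinned))
    # Present in both, but with different version strings.
    version_mismatch = {
        name: (pinned[name], installed[name])
        for name in sorted(set(pinned) & set(installed))
        if pinned[name] != installed[name]
    }
    return {
        "missing": missing,
        "version_mismatch": version_mismatch,
        "unlocked": unlocked,
    }


# Illustrative data only: package versions here are made up for the example.
EXAMPLE_LOCK = """
{
  "R": {"Version": "4.4.1"},
  "Packages": {
    "dplyr": {"Package": "dplyr", "Version": "1.1.4", "Source": "Repository"},
    "lme4": {"Package": "lme4", "Version": "1.1-35.5", "Source": "Repository"}
  }
}
"""
EXAMPLE_INSTALLED = {"dplyr": "1.1.2", "ggplot2": "3.5.1"}

if __name__ == "__main__":
    print(lockfile_mismatches(EXAMPLE_LOCK, EXAMPLE_INSTALLED))
```

In this example, lme4 is reported as missing, ggplot2 as unlocked, and dplyr as a version mismatch. The comparison is on exact version strings, which matches the intent of a pinned environment; a tool that also walks the dependency graph would need more than this sketch shows.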
Current research
WORLD-backwards scoring: empirical comparison. Multi-cohort comparison of five WORLD-backwards scoring rules applied to seven ADCS and ADNI MMSE datasets, validating the zzworld implementation against legacy Perl, SAS, C, and R reference implementations. Whitepaper near submission.
zzcollab framework paper. Methods paper describing the five-pillar zzcollab architecture (Dockerfile, renv.lock, .Rprofile, source code, data) and its application to clinical research compendia.
zzedc methods paper. Description and evaluation of the zzedc electronic data capture system for clinical trials.
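To illustrate why the choice of WORLD-backwards scoring rule matters, here is a minimal Python sketch of two candidate rules: a positional-match count and a score based on Levenshtein edit distance from the target "DLROW". These two rules are assumptions chosen for illustration; they are not claimed to be among the five rules compared in the study or implemented in zzworld, and the function names are hypothetical.

```python
TARGET = "DLROW"  # "WORLD" spelled backwards


def levenshtein(a, b):
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(
                prev[j] + 1,         # delete ch_a
                curr[j - 1] + 1,     # insert ch_b
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def score_positional(response):
    """Count letters that match the target in the same position (0 to 5)."""
    resp = response.strip().upper()
    return sum(1 for r, t in zip(resp, TARGET) if r == t)


def score_edit_distance(response):
    """Score as 5 minus the edit distance to the target, floored at 0."""
    resp = response.strip().upper()
    return max(0, len(TARGET) - levenshtein(resp, TARGET))


if __name__ == "__main__":
    for response in ["DLROW", "DLORW", "DROW", ""]:
        print(repr(response), score_positional(response), score_edit_distance(response))
```

The response "DROW" (one omitted letter) shows how the rules can diverge: it scores 1 under the positional rule, because every letter after the omission is shifted, but 4 under the edit-distance rule. Disagreements of this kind are the sort of difference a multi-cohort comparison of scoring rules would be expected to quantify.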
Publications
Publications at the intersection of statistical computing and methodology can be found in the full publications list by filtering on statistical-computing or data-visualization.