rgtlab

Blog

Writing on statistical computing, reproducible research, clinical-trial methodology, and open-source R software. Earlier posts are archived at focusonr.org.

Functional Plot Generation with purrr

r
data-visualization
I did not really know how to programmatically generate multiple plots from grouped data until I discovered purrr’s map2 and pmap functions – this post walks through the approach step by step using Palmer Penguins.
Jun 8, 2026
13 min

Setting Up Multi-Language Quarto Documents on macOS

quarto
r
python
julia
A practical guide to the configuration plumbing required to render a multi-language Quarto document from scratch on macOS.
Jun 8, 2026
16 min

Rapid Conversion of Draft R Scripts to Formal Rmd Reports

r
rmarkdown
reproducibility
I did not really know how to quickly convert a working R script into a presentable report until I discovered knitr::spin() and a few supporting workflows that changed how I share analytical results.
Jun 8, 2026
16 min

Install Linux Mint on a MacBook Air

linux
macos
r
python
workflow-construct
A practical guide to installing Linux Mint 22 on a 2016 MacBook Air, transforming aging Apple hardware into a functional data science workstation.
Jun 8, 2026
18 min

Provisioning AWS EC2 Instances: Console and CLI Methods

aws
shell
shiny
docker
Setting up an AWS EC2 instance from scratch, with two parallel paths: a console walkthrough for understanding the components, and four bash scripts for repeatable CLI automation.
May 17, 2026
20 min

Research Backup Architecture: Ongoing System and GitHub Archival

git
shell
macos
reproducibility
A unified treatment of research backup architecture: the three-tier ongoing system (automated Git pushes, cloud sync, and Time Machine) and the bulk GitHub archival procedure for migrating 400+ private repositories to local storage with verified backups and selective deletion.
May 17, 2026
46 min

Extending the R-Vim Workflow: LaTeX Integration and Dynamic Snippets

vim
r
python
workflow-construct
A complete configuration guide for the Vim-based R and LaTeX workflow: vimtex for LaTeX compilation, ALE for linting, UltiSnips for static snippet expansion, and UltiSnips Python interpolation for dynamic, parametric snippets.
May 17, 2026
23 min

Sharing R Code via Docker: R Markdown Reports and Shiny Applications

r
docker
rmarkdown
shiny
reproducibility
I did not really know how fragile sharing R code could be until my colleague spent an afternoon debugging missing packages, and I realised Docker could have prevented every single error – whether the output was a static PDF or a live Shiny app.
May 17, 2026
23 min

Migrating Off Dropbox: Beyond Dotfiles

workflow
sync
migration
dropbox
Post 24 establishes a portable dotfiles repository. This post extends the same goal to the rest of a single-user research workflow: project content, backup-pipeline source paths, and the append-only history files that Dropbox handles particularly badly. It frames the problem as three layers and walks through the trade-offs for each.
May 7, 2026
32 min

A tiered CI strategy for zzcollab research compendia

ci
github-actions
reproducibility
zzcollab-compendia
renv
I did not realise how much my CI was lying to me until I added a single explicit failure check and watched five ‘passing’ projects turn honestly red. A workflow-style migration guide for zzcollab projects across four workspace types.
May 6, 2026
31 min

Setting up OBS for Live R Coding Screencasts

setup
obs
youtube
screencast
r
reproducibility
A reproducible workflow for producing short, focused screencasts of R data analysis using OBS Studio and YouTube. Includes installation, scene configuration, recording, editing, and a worked five-minute example based on the Palmer Penguins dataset.
May 2, 2026
20 min

From testthat to tinytest: Converting an R Package Test Suite

r
testing
tinytest
testthat
packaging
I did not really appreciate how much ceremony testthat carries until I rewrote a small package’s suite in tinytest and lost roughly two-thirds of the lines without losing any coverage.
May 2, 2026
54 min

A 55-Item Initiation Checklist for zzcollab Data Analyses

r
zzcollab-compendia
reproducibility
workflow
checklist
quarto
A clinical research collaborator emails a single CSV file with twelve columns, two hundred and eighty-seven rows, no codebook, and a promise of an updated extract ‘next week’. We walk through the fifty-five items that move that attachment from the inbox to a reproducible zzcollab compendium ready for archival.
Apr 29, 2026
83 min

Refactoring a Personal Toolbox: Scripts versus Shell Functions

shell
shell-and-git
Personal toolboxes accumulate helpers across years of small fixes: some end up as shell functions in ‘.zshrc’, some as scripts in ‘~/bin’, often with no consistent rule for which goes where. A principled split (function only when shell state must change, versioned script otherwise) removes hundreds of lines of logic from the average dotfile, makes every helper shellcheck-able, and reduces shell startup time.
Apr 25, 2026
28 min

Building a statistical computing textbook in the Age of AI

quarto
teaching
reproducibility
I did not really appreciate how much structural decision-making goes into a textbook until I tried to draft two at once, both under the ‘in the Age of AI’ framing.
Apr 24, 2026
19 min

A pocket terminal for your Linux laptop with ttyd and Tailscale

linux
shell
workflow-construct
A reproducible configuration for browser-based terminal access to a Linux laptop from a mobile device, using ttyd for the terminal emulation layer and Tailscale for authenticated network transport, with no public-internet exposure.
Apr 15, 2026
37 min

Dynamic Column Names in R: Seven Approaches Compared

r
metaprogramming
r-language
A comprehensive guide to adding columns with programmatically generated names to R data frames. Covers base R, tidyverse (classic and modern), data.table, collapse, rlang::inject(), and do.call patterns.
Feb 16, 2026
10 min

Updating an R Package: A Complete Development Workflow

r
package-development
git
I did not really understand the full lifecycle of modifying an R package until I had to push a feature branch through CI/CD and watch it either pass or fail on three operating systems. This post walks through the entire process.
Feb 11, 2026
14 min

Setting Up Neovim as a Data Science IDE

neovim
vim
r
python
I did not really know how much faster code editing could be until I switched from a mouse-driven IDE to Neovim’s modal, keyboard-centric workflow.
Feb 11, 2026
17 min

Reproducible Blog Posts with ZZCOLLAB: A Quarto Workflow

quarto
r
docker
reproducibility
zzcollab-compendia
I did not really appreciate how fragile technical blog posts are until one of my own stopped rendering six months after publication. This post documents the workflow I built to treat each blog post as a standalone ZZCOLLAB reproducible research project with Docker, renv, and CI/CD.
Feb 10, 2026
24 min

The Pipe Equivalence Myth: When f() |> g() Is Not the Same as g(f())

r
metaprogramming
r-language
Piping and nesting function calls are semantically different operations. A subtle bug in an expression-capturing wrapper reveals how R’s lazy evaluation interacts with the pipe operator.
Jan 31, 2026
14 min

Running ZZedc Independently for Clinical Research Data Management

clinical-trials
shiny
aws
reproducibility
I did not really know how achievable investigator independence in clinical data management was until I deployed ZZedc on a personal AWS instance and ran a pilot study without vendor involvement.
Dec 7, 2025
15 min

From Markdown to Blog Post: A ZZCOLLAB Conversion Workflow

quarto
zzcollab-compendia
reproducibility
A systematic workflow for converting standalone markdown documentation into professional blog posts using ZZCOLLAB symlinks and Quarto metadata.
Dec 2, 2025
12 min

Combining Observable JS and Shiny in a Single Quarto Document

quarto
shiny
javascript
r
data-visualization
I did not really know how difficult it would be to combine Observable JS and Shiny in one Quarto document until every data-loading approach I tried failed except fetching from a public URL.
Dec 1, 2025
13 min

Testing Data Analysis Workflows in R

r
testing
reproducibility
I did not really know how to systematically test a data analysis pipeline until I applied software engineering practices from testthat and assertr to my own research workflow.
Jul 25, 2025
16 min

Configuring Yabai as a Tiling Window Manager on macOS

macos
shell
I did not really know how much faster a tiling window manager could make my daily workflow until I configured yabai with keyboard shortcuts and stopped reaching for the mouse.
Jun 20, 2025
16 min

Writing a Simple Vim Plugin for REPL Interaction

vim
r
r-packages
I did not really know how Vim’s terminal API worked until I wrote a small plugin that sends code from an editing buffer to a running R session.
May 20, 2025
16 min

Prototyping a Shiny App with ChatGPT

r
shiny
I did not really know how effective ChatGPT could be as a prototyping partner until I iteratively built a modular Shiny app for Palmer Penguins exploration in three prompts.
May 15, 2025
19 min

A Mac Workflow for Tracking Daily Research Progress

shell
git
macos
I did not really know how to keep a consistent daily research log until I combined macOS dictation with ChatGPT summarization and a few short bash scripts. This post walks through the entire workflow from folder structure to searchable, version-controlled notes.
Apr 12, 2025
14 min

Clinical Trial Data Validation Across Languages and Tools

clinical-trials
data-cleaning
r
lua
javascript
I did not really understand how many layers of validation sit between a clinical trial data entry form and a reliable analysis dataset until I started mapping out EDC edit checks, open-source tools, and the possibility of generating JavaScript validation from a simple spreadsheet processed by Lua.
Jan 15, 2025
25 min

Palmer Penguins Part 5: Random Forest versus Linear Models

r
random-forest
machine-learning
model-selection
A head-to-head comparison reveals that a random forest outperforms the linear model by only two percentage points, prompting reflection on the interpretability-performance tradeoff.
Jan 5, 2025
15 min

Palmer Penguins Part 4: Model Diagnostics and Interpretation

r
regression
model-selection
Verifying regression model trustworthiness through systematic diagnostic checks on linearity, normality, and influence.
Jan 4, 2025
18 min

Palmer Penguins Part 3: Cross-Validation and Model Comparison

r
cross-validation
random-forest
machine-learning
penguins-arc
Testing whether regression models hold up on new data through ten-fold cross-validation and comparison against a random forest.
Jan 3, 2025
14 min

Palmer Penguins Part 2: Multiple Regression and Species Effects

r
regression
Adding species identity to the body mass prediction model causes R-squared to jump from 0.76 to 0.86, demonstrating the power of biological groupings in multiple regression.
Jan 2, 2025
15 min

Predictive Modeling of Penguin Body Mass

r
regression
data-visualization
Regression models on the Palmer Penguins dataset reveal how much species identity shapes morphometric relationships. Simpson’s Paradox emerges clearly, reversing an apparent correlation once species is accounted for.
Jan 1, 2025
20 min

Palmer Penguins Part 1: Exploratory Data Analysis and Simple Regression

r
regression
data-visualization
penguins-arc
An exploration of how a simple flipper measurement can reveal substantial information about penguin body mass through the Palmer Penguins dataset and simple linear regression.
Jan 1, 2025
19 min

Constructing a reproducible blog post using zzcollab tools

r
zzcollab-compendia
reproducibility
I didn’t really know much about [topic] until I tried to [implement/understand] it myself. Here’s what I learned along the way.
Jan 1, 2025
24 min

    © 2026 Regents of the University of California. Ronald G. Thomas, UC San Diego Herbert Wertheim School of Public Health. Curriculum content: CC BY-NC-ND 4.0.

     