Stanford University · Jorge Roa, MSc

Tutorials

Data science and R programming tutorials by Jorge Roa — covering quanteda, deep learning fairness, PDF text extraction, multithreading, and migrating from Stata to R.

This is a selection of tutorials I created while studying for my master's in Berlin and throughout my professional career. I had a lot of fun making them, and they confirmed my passion for R.

Counting Without Loops: Difference Arrays and Cumulative Sums in R

R
Performance
Microsimulation
Algorithms
Tutorial

Age-based counting functions in a microsimulation are easy to write as nested loops and brutally slow at scale. This tutorial rebuilds them with a difference-array and cumulative-sum kernel — turning every range update into two events and one cumsum() — and benchmarks the speedup live in R.

May 22, 2026
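
As a rough illustration of the "two events and one cumulative sum" idea described in the entry above, here is a minimal sketch. It is written in Python rather than R, and it is not the tutorial's own code: the function names, the inclusive `(start, end)` age ranges, and the toy input are all assumptions made for illustration. The nested-loop version is included only as a reference to check the result against.

```python
from itertools import accumulate


def count_by_age(intervals, max_age):
    """Count how many intervals cover each integer age 0..max_age.

    Each (start, end) pair is an inclusive age range (hypothetical
    stand-in for one simulated individual). Assumes
    0 <= start <= end <= max_age.
    """
    diff = [0] * (max_age + 2)  # one extra slot so end + 1 is always valid
    for start, end in intervals:
        diff[start] += 1        # event 1: the range opens at `start`
        diff[end + 1] -= 1      # event 2: the range closes just after `end`
    # A single cumulative sum turns the events into per-age counts.
    return list(accumulate(diff))[: max_age + 1]


def count_by_age_loops(intervals, max_age):
    """Nested-loop reference version, used only to check the result."""
    counts = [0] * (max_age + 1)
    for start, end in intervals:
        for age in range(start, end + 1):
            counts[age] += 1
    return counts


# Toy input, made up for illustration.
intervals = [(0, 3), (2, 5), (4, 4)]
fast = count_by_age(intervals, max_age=6)
slow = count_by_age_loops(intervals, max_age=6)
print(fast)  # [1, 1, 2, 2, 2, 1, 0]
```

The loop version does work proportional to the total length of all ranges, while the difference-array version does constant work per range plus one pass over the ages. In R, the cumulative pass would presumably be the `cumsum()` call the entry mentions.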

Making the Transition — A Guide for Switching from Stata to R

R
Stata
Transition
Tutorial

If you use Stata for analysis but are curious about R, this is a side-by-side guide to the equivalents — working directories, packages, data loading, exploration, missing data, renaming and labelling variables, merges, IDs, frequencies, and tabulation.

Apr 23, 2023

PDF to Text — Extract Text and Tables with Python

Python
PDF
Text Extraction
Tutorial

Stop relying on web converters or paid PDF tools. Use Python with pdfplumber and tabulate to extract clean text and structured tables from PDF documents — useful for data wrangling, NLP preprocessing, and any workflow where the source data lives in a PDF.

Apr 18, 2023

Bias in AI: Detection and Mitigation

Python
Fairness
AIF360
Aequitas
ML
COMPAS
Tutorial

A hands-on COMPAS case study using Aequitas for group-level fairness metrics and AIF360 for reweighing-based bias mitigation. Covers demographic exploration, false-positive disparity by race, and pre-/post-processing mitigation with native Python chunks.

Dec 15, 2022

How We Met Quanteda — Text Analysis with R

R
Quanteda
Text Analysis
NLP
Tutorial

A hands-on introduction to quanteda built around an unlikely teaching corpus: every line of dialogue from How I Met Your Mother. Covers corpus construction, preprocessing, document-feature matrices, similarity, networks, and collocations.

Nov 15, 2022

Enable Multithreading with data.table on Mac (Intel + Apple Silicon)

R
data.table
Performance
macOS
Tutorial

data.table on macOS ships single-threaded by default — Apple’s bundled clang has no OpenMP. This guide walks through installing LLVM via Homebrew, configuring ~/.R/Makevars, reinstalling data.table from source, and verifying the multi-core build on both Intel and Apple Silicon machines.

Jul 28, 2022
© 2026  ·  JARC

Built with Quarto