Basic statistics
- Population vs Sample
- Mean, median and mode
- The geometric mean
- Weighted mean
- Variables and scales
- Range, interquartile range and box plots
- How to identify and deal with outliers
- Standard deviations and error bars
- Why we divide by n-1 and not n when we calculate the SD
- The normal distribution
- The central limit theorem
- The standard error of the mean (SEM)
- Confidence intervals
- The t-distribution – why we need it
- The one-sample t-test and p-values
- t-test vs confidence intervals
- The degrees of freedom
- The basic steps of hypothesis testing
- The unpaired t-test (independent samples t-test)
- How to check normal distribution
- One-way ANOVA: the basics
- One-way ANOVA: the calculations
- The paired t-test
- Paired vs unpaired t-test
- The repeated-measures ANOVA
- One-proportion Z-test and the corresponding confidence interval
- The Chi-square goodness of fit test and how it differs from the one-proportion Z-test
- The two-proportion Z-test and the Chi-square test of homogeneity
- The Chi-square test of independence vs homogeneity and goodness of fit
- McNemar test
- A deeper understanding of p-values
- How to compute a p-value with R
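As a small taste of the last item above, here is a minimal R sketch (simulated data, so the numbers are purely illustrative) that computes the two-sided p-value of a one-sample t-test both by hand with pt() and with t.test():

```r
# One-sample t-test: p-value by hand and with t.test() (simulated data)
set.seed(1)
x  <- rnorm(10, mean = 5.4, sd = 1)  # hypothetical sample
mu <- 5                              # mean under the null hypothesis

t_stat   <- (mean(x) - mu) / (sd(x) / sqrt(length(x)))
p_manual <- 2 * pt(-abs(t_stat), df = length(x) - 1)  # two-sided p-value
p_ttest  <- t.test(x, mu = mu)$p.value                # same value from t.test()
c(t = t_stat, p_manual = p_manual, p_ttest = p_ttest)
```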
Non-parametric tests
- The Mann-Whitney U test (Wilcoxon-Mann-Whitney test) part 1/2
- The Mann-Whitney U test (Wilcoxon-Mann-Whitney test) part 2/2 | exact p-value
- The Wilcoxon signed-rank test & the sign test
Type 1 and 2 errors and power
- The basics of type 1 and 2 errors
- The probability of making a type 1 error
- The probability of making a type 2 error (part 1/2)
- The probability of making a type 2 error (part 2/2)
- Statistical power and sample size calculations
- Statistical power – parametric vs nonparametric tests
Correlation
- Correlation – the basics | Pearson correlation
- Correlation | hypothesis testing | assumptions
- Spearman’s rank correlation | Pearson vs Spearman
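The correlation topics above can be tried out directly in base R; a minimal sketch with simulated data (purely illustrative):

```r
# Pearson and Spearman correlation (simulated data)
set.seed(1)
x <- rnorm(30)
y <- 0.6 * x + rnorm(30)             # hypothetical linear relationship

cor(x, y)                            # Pearson correlation coefficient
cor.test(x, y)                       # test of H0: rho = 0, with a confidence interval
cor.test(x, y, method = "spearman")  # Spearman's rank correlation
```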
Linear regression
- Linear regression – the basics
- Linear regression – least squares
- Linear regression – hypothesis testing
- Linear regression – R²
- Multiple linear regression
- Linear regression – assumptions
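A minimal base-R sketch of simple and multiple linear regression, assuming nothing beyond lm() and simulated data:

```r
# Simple and multiple linear regression with lm() (simulated data)
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
d$y <- 2 + 1.5 * d$x1 - 0.8 * d$x2 + rnorm(50)

fit1 <- lm(y ~ x1, data = d)       # simple linear regression (least squares)
fit2 <- lm(y ~ x1 + x2, data = d)  # multiple linear regression
summary(fit2)                      # coefficients, t-tests and R-squared
plot(fit2)                         # residual plots for checking the assumptions
```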
The expected value and the math behind n-1
How to select an appropriate statistical test
Regression vs ANOVA vs t-test
Two-way ANOVA
Post-hoc tests
- The familywise error rate (FWER)
- Fisher’s LSD method
- Bonferroni
- Holm
- Tukey’s test and Dunnett’s test
- How to select an appropriate post-hoc test
Tests based on the false discovery rate (FDR)
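To illustrate the post-hoc and multiplicity topics above, here is a minimal base-R sketch (simulated data and made-up p-values, illustrative only) that runs a one-way ANOVA, Tukey's test, and p-value adjustment with Bonferroni, Holm and the Benjamini-Hochberg FDR method:

```r
# One-way ANOVA, post-hoc comparisons and p-value adjustment (simulated data)
set.seed(1)
d <- data.frame(group = rep(c("A", "B", "C"), each = 10),
                y     = rnorm(30, mean = rep(c(5, 6, 7), each = 10)))

fit <- aov(y ~ group, data = d)
summary(fit)          # overall F-test
TukeyHSD(fit)         # Tukey's pairwise comparisons

p <- c(0.001, 0.012, 0.030, 0.045)  # made-up raw p-values from several tests
p.adjust(p, method = "bonferroni")  # controls the FWER
p.adjust(p, method = "holm")        # less conservative FWER control
p.adjust(p, method = "BH")          # Benjamini-Hochberg, controls the FDR
```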
Generalized linear models (GLMs)
Logistic regression
- Logistic regression 1: the basics
- Logistic regression 2: classification
- Logistic regression 3: Likelihood and deviance
- Logistic regression 4: Likelihood ratio test and AIC
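A minimal glm() sketch of the logistic-regression topics, with simulated data (illustrative only):

```r
# Logistic regression with glm() (simulated data)
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, size = 1, prob = plogis(-0.5 + 1.2 * d$x))

fit0 <- glm(y ~ 1, family = binomial, data = d)  # null model
fit1 <- glm(y ~ x, family = binomial, data = d)  # model with one predictor
summary(fit1)                     # coefficients on the log-odds scale
anova(fit0, fit1, test = "LRT")   # likelihood ratio test
AIC(fit0, fit1)                   # compare the models with AIC
head(predict(fit1, type = "response"))  # predicted probabilities (classification)
```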
Poisson regression
- The Poisson distribution vs the normal distribution
- Poisson regression 1 – the basics
- Poisson regression 2 – rates and the offset
- Poisson regression 3 – categorical variables
- Poisson regression 4 – how to calculate the likelihood and the deviance
- Poisson regression 5 – compare models with the likelihood ratio test and AIC
- Quasi-Poisson and negative binomial regression models
- Zero-inflated Poisson (ZIP) regression
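The core Poisson-regression ideas above map onto a short glm() sketch (simulated data; the offset turns counts into rates):

```r
# Poisson regression with an offset, plus a quasi-Poisson check for overdispersion
set.seed(1)
d <- data.frame(x = rnorm(100), exposure = runif(100, 1, 10))
d$counts <- rpois(100, lambda = d$exposure * exp(0.2 + 0.5 * d$x))

fit <- glm(counts ~ x + offset(log(exposure)), family = poisson, data = d)
summary(fit)        # coefficients on the log scale
exp(coef(fit))      # rate ratios

fitq <- glm(counts ~ x + offset(log(exposure)), family = quasipoisson, data = d)
summary(fitq)$dispersion  # close to 1 if there is no overdispersion
```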
Linear mixed-effects models
Bayesian statistics
Introduction to multivariate statistics
To understand the methods used in multivariate statistics, you need some basic linear algebra: matrix operations, eigenvectors and eigenvalues, as well as the meaning of covariance and of different distance measures in space. A short R sketch after the lists below ties these ideas together.
Linear algebra
- Matrices and matrix operations – part 1
- Matrices and matrix operations – part 2
- Eigenvectors and eigenvalues – the basics
- Eigenvectors and eigenvalues – the math
Covariance and distances
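A minimal R sketch (simulated data, purely illustrative) that connects these prerequisites: the covariance matrix, its eigenvectors and eigenvalues, and Mahalanobis distances:

```r
# Covariance matrix, eigen decomposition and Mahalanobis distance (simulated data)
set.seed(1)
X <- cbind(x1 = rnorm(100), x2 = rnorm(100))
X[, 2] <- 0.7 * X[, 1] + 0.5 * X[, 2]   # introduce some correlation

S <- cov(X)       # sample covariance matrix
e <- eigen(S)     # eigenvalues and eigenvectors of S
e$values          # variances along the principal axes
e$vectors         # directions of those axes
head(mahalanobis(X, center = colMeans(X), cov = S))  # squared Mahalanobis distances
```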
Once you know the basics from the videos above, you can move on to multivariate statistical methods. I recommend starting with PCA; a minimal prcomp() sketch follows the PCA list below.
PCA
- PCA 1: the basics
- PCA 2: the math
- PCA 3: standardization and extraction
- PCA 4: interpret the weights and Varimax rotation
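A minimal prcomp() sketch of the PCA topics, using the built-in iris data purely as an example:

```r
# PCA with prcomp() on standardized variables
p <- prcomp(iris[, 1:4], scale. = TRUE)  # scale. = TRUE standardizes each variable
summary(p)      # proportion of variance explained by each component
p$rotation      # loadings (weights) of the original variables
head(p$x)       # scores of the observations on the components
biplot(p)       # scores and loadings in one plot
```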
LDA
Multivariate statistical methods
Metrics used for binary classification and validation
- Sensitivity and specificity
- The positive and negative predictive values
- The ROC curve
- Validation (cross-validation, hold-out, LOOCV)
- Likelihood ratio
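The classification metrics above follow directly from a 2x2 confusion matrix; a small sketch with made-up counts:

```r
# Sensitivity, specificity, PPV and NPV from a 2x2 confusion matrix (made-up counts)
TP <- 80;  FN <- 20    # diseased:     test positive / test negative
FP <- 30;  TN <- 170   # non-diseased: test positive / test negative

sensitivity <- TP / (TP + FN)  # P(test positive | disease)
specificity <- TN / (TN + FP)  # P(test negative | no disease)
ppv         <- TP / (TP + FP)  # P(disease | test positive)
npv         <- TN / (TN + FN)  # P(no disease | test negative)
# note: PPV and NPV depend on the prevalence implied by these counts
c(sensitivity = sensitivity, specificity = specificity, PPV = ppv, NPV = npv)
```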
Classification methods
Logistic regression
- Logistic regression 1: the basics
- Logistic regression 2: classification
- Multinomial logistic regression
Linear discriminant analysis
k-nearest neighbors and Mahalanobis distance
Decision trees and random forest
Support vector machines
Gaussian naive Bayes
Clustering methods
Partial least squares regression
Canonical correlation analysis
Neural networks
Survival analysis
- Kaplan-Meier curve
- Comparing Kaplan-Meier curves – the Log-rank test
- The Cox proportional hazards model explained
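A minimal sketch of these survival-analysis topics, assuming the survival package (a recommended package that ships with most R installations) and its built-in lung data set:

```r
# Kaplan-Meier curves, log-rank test and Cox regression with the survival package
library(survival)

fit <- survfit(Surv(time, status) ~ sex, data = lung)  # KM estimate per group
plot(fit)                                              # Kaplan-Meier curves
survdiff(Surv(time, status) ~ sex, data = lung)        # log-rank test
coxph(Surv(time, status) ~ sex + age, data = lung)     # Cox proportional hazards model
```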
Gene set analysis
In gene set analysis, Fisher’s exact test is usually used to identify overrepresented gene sets. To understand Fisher’s exact test, we first need to understand a few things about permutations and combinations; a short R sketch follows the list below.
- Permutations, combinations and the hypergeometric distribution
- Fisher’s test and how to calculate the exact p-value
- Gene set analysis
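A minimal R sketch of the overrepresentation test with made-up numbers: 40 genes of a 200-gene set show up among 500 significant genes, out of 20,000 genes in total.

```r
# Overrepresentation via the hypergeometric distribution and Fisher's exact test
x       <- 40      # genes from the set among the significant genes
setSize <- 200     # genes in the gene set
nSig    <- 500     # significant genes
nTotal  <- 20000   # genes measured in total

# P(at least x genes from the set among the nSig selected genes):
phyper(x - 1, m = setSize, n = nTotal - setSize, k = nSig, lower.tail = FALSE)

# The same one-sided p-value from Fisher's exact test on the 2x2 table
# (rows: significant / not significant; columns: in the set / not in the set):
tab <- matrix(c(x, setSize - x, nSig - x, nTotal - setSize - (nSig - x)), nrow = 2)
fisher.test(tab, alternative = "greater")$p.value
```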
Model selection
- Model selection with AIC and AICc
- Forward and backward selection, best subset selection
- Lasso regression
Statistics in epidemiology
Additional videos
Mathematical and computational biology / systems biology
Some basic math
- Understanding the equation of a straight line
- Logarithms
- Derivatives – the basics
- Second and partial derivatives
- Numerical differentiation
- Why do we use Euler’s number e?
- Exponential growth
- Doubling time with examples
- Exponential decay – a cup of coffee
- Ordinary differential equations (ODE)
- Euler’s method
- How to solve ordinary differential equations (ODEs) in R (deSolve)
- How to build a system of differential equations (ODEs)
- Receptor ligand kinetics
- The SIR model | the math of epidemics
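A minimal deSolve sketch for the ODE items above, solving the exponential-decay equation dy/dt = -k·y (assumes the deSolve package is installed):

```r
# Solving dy/dt = -k * y numerically with deSolve
library(deSolve)

decay <- function(t, state, parms) {
  with(as.list(c(state, parms)), {
    list(c(-k * y))   # return the derivative dy/dt
  })
}

times <- seq(0, 10, by = 0.1)
out   <- ode(y = c(y = 100), times = times, func = decay, parms = c(k = 0.3))
head(out)   # columns: time and y
plot(out)   # compare with the analytical solution 100 * exp(-0.3 * time)
```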
Optimization – find the minimum value of a function
Nonlinear regression
- Nonlinear regression – the basics
- Nonlinear regression – comparing models with F test and AIC
- Nonlinear regression – how to fit a dose-response curve in R
- Nonlinear regression – how to fit a logistic growth model to data
- Nonlinear mixed-effects models (NLME)
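A minimal nls() sketch that fits a logistic growth curve with the self-starting model SSlogis (simulated data, illustrative only):

```r
# Nonlinear regression: logistic growth fitted with nls() and SSlogis (simulated data)
set.seed(1)
d <- data.frame(t = 0:20)
d$y <- 100 / (1 + exp((10 - d$t) / 2)) + rnorm(nrow(d), sd = 3)

fit <- nls(y ~ SSlogis(t, Asym, xmid, scal), data = d)
summary(fit)   # estimated asymptote (Asym), midpoint (xmid) and scale (scal)
AIC(fit)       # useful for comparing competing nonlinear models

plot(d$t, d$y, xlab = "time", ylab = "y")
lines(d$t, predict(fit))
```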
Cellular automata – spatial modeling