Biostatistics and Machine Learning

Our courses

Introduction to biostatistics and machine learning

The course is geared towards life scientists who want to understand and use basic statistical and machine learning methods. It also suits those who already apply biostatistical methods but have never had the chance to truly understand the underlying statistical concepts, such as the commonly misinterpreted p-value.

Introduction to biostatistics and machine learning II

This course expands on common life science data analysis methods, including dimensionality reduction techniques beyond PCA, mixed-effects models for analysis of repeated measures, and survival analysis. We will also dive deeper into machine learning, covering more classification algorithms, ensemble techniques, optimization strategies and PLS methods for single and multi-omics data analysis.

Application & Registration of interest

Introduction to biostatistics and machine learning

A Hands-on Approach to Biostatistics & ML

Gain a deep understanding of statistical modeling and machine learning methods through real-world applications in life sciences.

Probability Theory

Understand the foundations of probability, including key concepts such as random variables, distributions, and expectations, essential for statistical modeling.

Hypothesis Testing & Confidence Intervals

Learn how to make informed decisions using statistical hypothesis testing, p-values, and confidence intervals to assess data-driven conclusions.

Resampling Techniques

Explore powerful methods like bootstrapping and permutation tests for estimating uncertainty and improving statistical inference.
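
As a rough illustration of the two methods named above, here is a minimal sketch using only the Python standard library. The measurements, the function names `bootstrap_ci` and `permutation_test`, and the number of resamples are made up for illustration and are not taken from the course material.

```python
import random
import statistics

# Hypothetical measurements for two groups (illustrative values only)
control = [4.1, 5.0, 4.7, 5.3, 4.9, 5.1, 4.6, 4.8]
treated = [5.9, 6.4, 6.1, 5.7, 6.6, 6.0, 6.3, 5.8]


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        )
        if diff >= observed:
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    print("95% bootstrap CI for the control mean:", bootstrap_ci(control))
    print("Permutation p-value:", permutation_test(control, treated))
```

The bootstrap resamples the observed data with replacement to approximate the sampling distribution of the mean, while the permutation test shuffles group labels to approximate the distribution of the difference under the null hypothesis of no group effect.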

Linear Regression & Generalized Linear Models

Develop a strong grasp of linear regression and extend it to generalized linear models (GLMs) for handling various types of response variables.

Model Evaluation & Validation

Master essential techniques for assessing model performance, including cross-validation, the bias-variance tradeoff, and goodness-of-fit metrics.

Unsupervised Learning: Clustering & Dimension Reduction

Discover data-driven approaches such as PCA and clustering algorithms (k-means, hierarchical clustering) to uncover hidden patterns in data.

Supervised Learning & Classification

Dive into machine learning classification techniques, covering logistic regression, decision trees, and random forests.

Master Advanced Biostatistics & Machine Learning

Take your data analysis skills to the next level with cutting-edge statistical and ML techniques.

Advanced Dimensionality Reduction

Explore techniques beyond PCA, such as UMAP and t-SNE, for high-dimensional data visualization and feature extraction.

Comprehensive Classification & Ensemble Methods

Master algorithms such as random forests and support vector machines (SVMs), and understand ensemble techniques such as bagging, boosting, and stacking.

Machine Learning Optimization Strategies

Learn hyperparameter tuning, cross-validation, and model selection techniques to optimize performance in ML workflows.

PLS Methods for Multi-Omics Analysis

Implement PLS, PLS-DA, and sPLS for analyzing complex biological datasets, enabling effective variable selection and feature extraction.

Mixed-Effects Models & Longitudinal Data

Apply mixed models for repeated measures, nested designs, and longitudinal studies to capture variability in complex biological data.

Survival Analysis & Cox Regression

Analyze censored data, estimate survival curves with Kaplan-Meier methods, and model time-to-event data using Cox proportional hazards models.
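
To give a sense of what a Kaplan-Meier estimate involves, here is a minimal sketch in plain Python. The function name `kaplan_meier` and the follow-up times are hypothetical and chosen only for illustration. In practice a dedicated survival analysis library would be used.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times:  follow-up time for each subject
    events: 1 if the event was observed at that time, 0 if censored
    """
    # Sort subjects by follow-up time
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Multiply by the fraction of at-risk subjects surviving time t
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Subjects with an event or censored at t leave the risk set
        at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (e.g. months) and event indicators
    times = [2, 3, 3, 5, 8, 8, 9, 12]
    events = [1, 1, 0, 1, 1, 0, 1, 0]
    for t, s in kaplan_meier(times, events):
        print(f"t = {t:>2}  S(t) = {s:.3f}")
```

Censored subjects still count as at risk up to the time they leave the study, which is how the estimator uses partial follow-up information instead of discarding it.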

Introduction to Neural Networks & Deep Learning

Gain foundational knowledge of convolutional neural networks (CNNs) and recurrent neural networks (RNNs), and explore the application of large language models (LLMs) in life sciences.

Final Integration Challenge

Apply ML workflows and statistical models to real-world datasets in a hands-on challenge synthesizing course concepts.

Frequently Asked Questions

What are the prerequisites for the Introduction to Biostatistics and Machine Learning course?

Participants should have basic R programming skills, including using R as a calculator, working with vectors and matrices, reading and writing CSV files, using built-in summary functions, creating simple plots, and basic programming constructs like loops and if-else statements. No prior biostatistical knowledge is required, only basic math skills. Pre-course study materials will be available upon course acceptance.

What are the prerequisites for the Advanced Biostatistics and Machine Learning course?

Participants should have a basic understanding of descriptive statistics, hypothesis testing, and linear regression. They should also have experience with both R and Python for data science, including using NumPy, pandas, and Matplotlib for data manipulation and visualization in Python. Familiarity with machine learning concepts is recommended.

Do I need to bring anything for the course?

Yes, participants must bring their own laptop (BYOL) with R and RStudio installed. For the second course, Python should also be installed.

How are participants selected?

Due to limited space (maximum 24 participants), selection is based on meeting entry requirements, motivation for attending, and gender/geographical balance. NBIS prioritizes academic participants in Sweden but may accept industry professionals and international applicants if seats are available.

How much does the course cost?

The fee is 3,000 SEK for academic participants and 15,000 SEK for non-academic participants. This includes lunches and coffee. Please note that NBIS cannot invoice individuals.

Do I need prior biostatistics knowledge for the first course?

No, the first course assumes no prior biostatistics knowledge. Only basic math skills are required, and additional pre-course materials will be provided upon acceptance.

Does the second course cover deep learning?

Yes, the second course introduces neural networks, including CNNs and RNNs, and explores the application of large language models (LLMs) in life sciences.

Get in touch

edu.ml-biostats@nbis.se