Biostatistics and Machine Learning
We provide biostatistics and machine learning training as part of the National Bioinformatics Infrastructure Sweden (NBIS). Our courses and workshops support researchers and students working with data in life sciences, medicine, and related fields. Whether you're new to the field or an experienced researcher, we offer training that helps you apply statistical and machine learning methods to real-world research problems.

Our courses
Introduction to biostatistics and machine learning
The course is geared towards life scientists who want to understand and use basic statistical and machine learning methods. It also suits those who already apply biostatistical methods but have never had the chance to truly understand the underlying statistical concepts, such as the commonly misinterpreted p-value.
Introduction to biostatistics and machine learning II
This course expands on common life science data analysis methods, including dimensionality reduction techniques beyond PCA, mixed-effects models for analysis of repeated measures, and survival analysis. We will also dive deeper into machine learning, covering more classification algorithms, ensemble techniques, optimization strategies and PLS methods for single and multi-omics data analysis.
Application & Registration of interest
We are now accepting applications for Introduction to biostatistics and machine learning.
Introduction to biostatistics and machine learning
A Hands-on Approach to Biostatistics & ML
Gain a deep understanding of statistical modeling and machine learning methods through real-world applications in life sciences.
Probability Theory
Understand the foundations of probability, including key concepts such as random variables, distributions, and expectations, essential for statistical modeling.
Hypothesis Testing & Confidence Intervals
Learn how to make informed decisions using statistical hypothesis testing, p-values, and confidence intervals to assess data-driven conclusions.
Resampling Techniques
Explore powerful methods like bootstrapping and permutation tests for estimating uncertainty and improving statistical inference.
Linear Regression & Generalized Linear Models
Develop a strong grasp of linear regression and extend it to generalized linear models (GLMs) for handling various types of response variables.
Model Evaluation & Validation
Master essential techniques for assessing model performance, including cross-validation, bias-variance tradeoff, and goodness-of-fit metrics.
Unsupervised Learning: Clustering & Dimension Reduction
Discover data-driven approaches such as PCA and clustering algorithms (k-means, hierarchical clustering) to uncover hidden patterns in data.
Supervised Learning & Classification
Dive into machine learning classification techniques, covering logistic regression, decision trees, and random forests.
Master Advanced Biostatistics & Machine Learning
Take your data analysis skills to the next level with cutting-edge statistical and ML techniques.
Advanced Dimensionality Reduction
Explore techniques beyond PCA, such as UMAP and t-SNE, for high-dimensional data visualization and feature extraction.
Comprehensive Classification & Ensemble Methods
Master algorithms such as random forests and SVMs while understanding ensemble techniques such as bagging, boosting, and stacking.
Machine Learning Optimization Strategies
Learn hyperparameter tuning, cross-validation, and model selection techniques to optimize performance in ML workflows.
PLS Methods for Multi-Omics Analysis
Implement PLS, PLS-DA, and sPLS for analyzing complex biological datasets, enabling effective variable selection and feature extraction.
Mixed-Effects Models & Longitudinal Data
Apply mixed models for repeated measures, nested designs, and longitudinal studies to capture variability in complex biological data.
Survival Analysis & Cox Regression
Analyze censored data, estimate survival curves with Kaplan-Meier methods, and model time-to-event data using Cox proportional hazards models.
Introduction to Neural Networks & Deep Learning
Gain foundational knowledge of CNNs and RNNs, and explore the application of large language models in life sciences.
Final Integration Challenge
Apply ML workflows and statistical models to real-world datasets in a hands-on challenge synthesizing course concepts.
Frequently Asked Questions
What are the prerequisites for the Introduction to Biostatistics and Machine Learning course?
Participants should have basic R programming skills, including using R as a calculator, working with vectors and matrices, reading and writing CSV files, using built-in summary functions, creating simple plots, and basic programming constructs like loops and if-else statements. No prior biostatistical knowledge is required, only basic math skills. Pre-course study materials will be available upon course acceptance.
What are the prerequisites for the Advanced Biostatistics and Machine Learning course?
Participants should have a basic understanding of descriptive statistics, hypothesis testing, and linear regression. They should also have experience with both R and Python for data science, including using NumPy, pandas, and Matplotlib for data manipulation and visualization in Python. Familiarity with machine learning concepts is recommended.
Do I need to bring anything for the course?
Yes, participants must bring their own laptop with R and RStudio installed. For the second course, Python should also be installed.
How are participants selected?
Due to limited space (maximum 24 participants), selection is based on meeting entry requirements, motivation for attending, and gender/geographical balance. NBIS prioritizes academic participants in Sweden but may accept industry professionals and international applicants if seats are available.
How much does the course cost?
The fee is 3,000 SEK for academic participants and 15,000 SEK for non-academic participants. This includes lunches and coffee. Please note that NBIS cannot invoice individuals.
Do I need prior biostatistics knowledge for the first course?
No, the first course assumes no prior biostatistics knowledge. Only basic math skills are required, and additional pre-course materials will be provided upon acceptance.
Does the second course cover deep learning?
Yes, the second course introduces neural networks, including CNNs and RNNs, and explores the application of large language models (LLMs) in life sciences.
Get in touch
edu.ml-biostats@nbis.se