Statistical methods in bioinformatics (April 2018)

Date/Time
Date(s) - 2018-04-25 - 2018-04-27
09:00 CEST - 16:00 CEST

Statistical methods in bioinformatics

April 25th – 27th, 2018, Room 202, EMBL Heidelberg

Instructor: Bernd Klaus, bernd.klaus@embl.de

This advanced course provides an overview of statistical methods that are
commonly used in the analysis of high throughput data sets. All methods
will be introduced using RNA-Seq (single cell and bulk) datasets.
A working knowledge of R is required for this course and can be obtained
through self-study using this material:

No longer available (R intro materials)

The course should also suit people with some experience in another
scripting language such as Python or Matlab. While the course focuses on R,
many of the techniques covered can also be implemented in other scripting languages,
for example in Python using packages such as pandas and scikit-learn (sklearn).

The course will be a mix of lectures and hands-on training. Practicals will
consist of computer exercises that will enable the participants to apply statistical
methods to the analysis of data under the guidance of the lecturer and
possibly teaching assistants.

The course is open to all at EMBL (including outstations) and is free of charge.

Please register below!

The current version of the course materials can be found here:

No longer available (course materials)

Timetable

Wednesday, April 25th, 2018

Data handling and tidy data

9:00 – 12:00
ca. 10:30 Coffee break

Basics of arithmetic and data handling in R
Data frames and tidy data

12:00 – 13:00 Lunch break

13:00 – 16:00
ca. 14:30 Coffee break

Data handling with dplyr verbs
The "group-apply-combine" strategy for data analysis
Putting together tables with related information

Thursday, April 26th, 2018

Visual exploration for bioinformatics

9:00 – 12:00
ca. 10:30 Coffee break

Review of ggplot2 for elegant graphics
Regression and local regression (LOESS)
Normalization and variance stabilization of (sc)RNA-Seq count data

12:00 – 13:00 Lunch break

13:00 – 16:00
ca. 14:30 Coffee break

Heatmaps and clustering
Dimensionality reduction: PCA, MDS and t-SNE

Friday, April 27th, 2018

Factor models and machine learning

9:00 – 12:00
ca. 10:30 Coffee break

Factor analysis methods, dealing with batch effects
Statistical testing

12:00 – 13:00 Lunch break

13:00 – 16:00
ca. 14:30 Coffee break

Resampling based clustering
Machine learning using randomForest

Bookings

❗ This event is fully booked. ❗