Data Science for the
Life Sciences
Apply for our January course. Limited spots available.
Online
2 live sessions/week
5 Weeks
Starts Jan. 22nd, 2024
Evenings
6pm EST
$990
Tailored curriculum and resources
Syllabus & Schedule
Course Dates: Jan. 22nd, 2024 to Feb. 22nd, 2024
Times: 6pm - 7:15pm EST every Monday and Thursday
Week 1
Reproducible Research
Learn to use the key tools and languages for reproducible research effectively: Unix, Quarto, Git, GitHub, and more.
Week 2
Data Wrangling
Learn to clean and normalize data into a form that can be used for analysis. We will explore why data wrangling matters and apply these skills to health data and R&D data.
Week 3
Data Visualization
Learn to use visualization techniques alongside statistics to find relationships and meaning in large datasets, such as data on vaccine effectiveness and COVID outbreaks.
Week 4
Machine Learning I
Explore fundamental machine learning principles and methods that can be applied in a variety of settings. We will cover topics such as smoothing, cross-validation, and evaluation metrics, and apply them to case studies such as image analysis.
Week 5
Machine Learning II
Continue with concepts that are used more frequently in the life sciences, such as clustering. We will work through a case study that clusters tissues using gene expression data.
What You'll Gain
You will learn how to apply key statistical concepts together with data science skills to real life-science datasets. Hear from industry veterans about what you need to succeed in the life sciences. Top performers in the course will have the opportunity to network with industry partners.
Data Wrangling
Data Visualization
Machine Learning
Industry Experts
Instructor
Professor Rafael Irizarry
Rafael Irizarry is currently a Professor and Chair of the Department of Data Science at the Dana-Farber Cancer Institute and a Professor of Biostatistics at the Harvard T.H. Chan School of Public Health.
Rafael is well known for his work in genomics, specifically the analysis of data from high-throughput technologies. His contributions to open-source statistical methodologies have made him one of the most highly cited scientists in his field.