Author

Otho Mantegazza

Slides

Introduction:

  1. Introduction

Hands on data with R:

  1. Meet R
  2. Manipulate Data
  3. Missing Values
  4. Visualize Data
  5. Notes on Correlation
  6. Robust Statistics
  7. Get Data into R
  8. Clean Data with R
  9. Explorative Data Analysis

Explore your data with statistical models:

  1. A Few Words on Statistical Models
  2. Supervised Learning - A Glimpse of Linear Models and an Introduction to Tidymodels
  3. [🚧 Work in Progress ⚠️] Unsupervised Learning - Explore your Data with PCA

Tools

R and RStudio

You can run R within Visual Studio Code, in the Docker container provided by the summer school organizers.

Otherwise, you can also:

Remember! R works with packages.

Install a package

First install the package with install.packages() (you only have to do it once).

Load a package

Then load it with library() to make its functions available (you have to do this at the beginning of each of your scripts).
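The difference between an installed package and a loaded one can be seen in a short sketch. It uses tools, a package that ships with R, only so that it runs before anything else is installed; the file name "penguins.csv" is a made-up example. The same pattern applies to the course packages.

```r
# 'tools' ships with R: it is installed, but not attached by default.
attached_before <- "package:tools" %in% search()

# Attach the package for the current session.
library(tools)

# Functions from 'tools' can now be called by name.
ext <- file_ext("penguins.csv")

# Alternative: call a single function without attaching the package,
# using the package::function form.
ext2 <- tools::file_ext("penguins.csv")

print(ext)
print(ext2)
```

The package::function form is handy when you need one function only, or when two attached packages export functions with the same name.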

Packages that we are going to use:

Please install these packages:

install.packages(c('tidyverse', 'palmerpenguins', 'here', 'broom', 'janitor'))
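If you are not sure which of these packages you already have, a minimal sketch like the one below can list the missing ones before you install anything. The helper missing_packages() is hypothetical (it is defined here, not taken from any package), and the install call is left commented out so the snippet never downloads anything by itself.

```r
# Course packages, copied from the install.packages() call above.
course_pkgs <- c("tidyverse", "palmerpenguins", "here", "broom", "janitor")

# Hypothetical helper (not part of any package): return the names in
# `pkgs` that cannot be found in the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(course_pkgs)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment the next line to install only what is missing:
  # install.packages(to_install)
} else {
  message("All course packages are already installed.")
}
```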

and place this snippet of code at the top of all your scripts.

library(tidyverse)
library(here)
library(palmerpenguins)
library(broom)
library(janitor)

Great Books About Data Analysis

These are the textbooks that I love and that I use as a daily reference. They are all openly accessible.

R

  • R for Data Science - An introduction to data analysis with R/Tidyverse by Hadley Wickham and Garrett Grolemund.
  • Introduction to Data Science - A detailed introduction to Data science by the biostatistician Rafael A. Irizarry.
  • Advanced R - All you wish to know about programming in R by Hadley Wickham.
  • An Introduction to Statistical Learning - A detailed introduction to modern statistical methods, implemented in R, by Gareth James, Daniela Witten, Trevor Hastie and Rob Tibshirani.
  • Text Mining with R - Analyzing natural language and written text in R, by Julia Silge and David Robinson.
  • Tidy Modeling with R - An introduction to the tools that compose R’s machine learning framework, by Max Kuhn and Julia Silge.
  • Analysing Data Using Linear Models - For students in social, behavioural and management science, by Stéphanie M. van den Berg.

Python

Javascript

Git / GitHub

Project management

Dataviz Design

Dashboards

Computer Science

  • Missing Semester - A generic intro to basic CS productivity tips and tools, by Anish Athalye.

Bayesian Statistics in R and Python

Geocomputation

  • Geocomputation with R; a book on geographic data analysis, visualization and modeling by Robin Lovelace, Jakub Nowosad and Jannes Muenchow.
  • Spatial Data Science; concepts, packages and models for spatial data science in R, by Edzer Pebesma and Roger Bivand.

More Books at Bookdown

  • Check out the bookdown repository for many more.

Authors

Support the authors of these textbooks with the means that are available to you. They are heroes.

Source Code

The source code for this course is available on GitHub.