class: center, middle, inverse, title-slide

# Introduction to R the Tidy Way
## R and RStudio Basics
### Mine Dogucu
### 2020-03-05

---
layout: true

<!-- This file by Mine Dogucu is licensed under a Attribution-ShareAlike 2.5 Generic License (CC BY-SA 2.5) More information about the license can be found at https://creativecommons.org/licenses/by-sa/2.5/ -->

<div class="my-header"></div>

<div class="my-footer"> CC BY-SA <a href="https://mdogucu.ics.uci.edu">Mine Dogucu</a></div>

---
class: center, middle

## License

<img src="img/cc-sa.png" width="100%" />

More information can be found [here](https://creativecommons.org/licenses/by-sa/2.5/)

---

## RStudio Cloud

- Log into RStudio Cloud using the link on the whiteboard.
- Under Spaces in the left navigation bar, intro-r-tidy should be selected.
- In this workspace, make a copy of the pre-lunch project.

<img src="img/copy.png" width="100%" />

---

## No Temporary Project

Your project should NOT be temporary. If it is marked as temporary, save a permanent copy. If you already made a copy, you are all set.

<img src="img/temp.png" width="100%" />

---

## .Rmd

- In the `Files` tab of the lower right pane, open the `sample-rmarkdown.Rmd` file.

---

## Knitting

- Click the Knit button to knit this document. Then we will talk about R Markdown documents.

<img src="img/knit.png" width="100%" />

---

## Our very own .Rmd

- Open `1_basics.Rmd`
- I will do a live demo

---

## Demo - Adding Chunks

<img src="img/code-chunk.png" width="100%" />

---

## Demo - Run Code

<img src="img/run-code.png" width="100%" />

---

## Demo - Defining Objects

```r
birth_year <- 1950
# note the change in the environment
age <- 2020 - birth_year
age
```

```
## [1] 70
```

---

## Demo - Defining Objects

R is case sensitive.

```r
Age
```

```
## Error in eval(expr, envir, enclos): object 'Age' not found
```

---

## Demo - Defining Objects

```r
name_p <- c("Molly", "Zainab", "Mohsen")
# c is a combine function
# strings come in quotes
kid_p <- c(2, 3, 2)
```

---

## Demo - Defining Objects

```r
data.frame(name = name_p, kid = kid_p)
```

```
##     name kid
## 1  Molly   2
## 2 Zainab   3
## 3 Mohsen   2
```

---

## Vocabulary

`do(something)`

`do` is a function

`something` is an argument

`do(something, colorful)`

`do` is a function

`something` is an argument

`colorful` is an argument

---

## Looking at Data

Get to know the `candy_rankings` data by:

- Clicking on the data frame in the Environment.
- Clicking on the blue button.
- Using `glimpse()`

---

## Looking at Data - `glimpse()`

`glimpse()` shows the variables and some of their values, but not in the usual row-column format. Instead it prints a flipped version, with one variable per line, along with additional information about the variables and the data frame.

---

## Looking at Data - `glimpse()`

```r
glimpse(candy_rankings)
```

```
## Observations: 85
## Variables: 13
## $ competitorname   <chr> "100 Grand", "3 Musketeers", "One dime", "One...
## $ chocolate        <lgl> TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, ...
## $ fruity           <lgl> FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, FALS...
## $ caramel          <lgl> TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE...
## $ peanutyalmondy   <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE...
## $ nougat           <lgl> FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE...
## $ crispedricewafer <lgl> TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS...
## $ hard             <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL...
## $ bar              <lgl> TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, ...
## $ pluribus         <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL...
## $ sugarpercent     <dbl> 0.732, 0.604, 0.011, 0.011, 0.906, 0.465, 0.6...
## $ pricepercent     <dbl> 0.860, 0.511, 0.116, 0.511, 0.511, 0.767, 0.7...
## $ winpercent       <dbl> 66.97173, 67.60294, 32.26109, 46.11650, 52.34...
```

---
class: center, middle

## More on Candy Rankings

[The Ultimate Halloween Candy Power Ranking](https://fivethirtyeight.com/features/the-ultimate-halloween-candy-power-ranking/)

---

## Schedule for the Day

__08:45 - 09:00 Introduction__

__09:00 - 09:15 Getting to Know the Basics__

09:15 - 10:15 Data Visualization

10:15 - 10:30 Break

10:30 - 12:00 Data Wrangling

12:00 - 01:00 Lunch

01:00 - 01:30 Working Locally With R

01:30 - 02:00 Dealing with Datasets

02:00 - 02:30 Case Study

02:30 - 02:45 Break

02:45 - 03:30 Modeling

03:30 - 04:00 Everything I did not have time to cover