00:12:03 Gabriela Gonzalez: is lab03 not showing up for anyone else on RStudio?
00:13:20 Julia Hankin: Mine’s showing up, Gabby! That’s weird :(
00:13:24 Silvana Larrea: Hi Gabriela, mine is showing too
00:13:50 Gabriela Gonzalez: rstudio hasn’t been on my side this morning :’(
00:14:03 Silvana Larrea: I think last week there were also other students that had this problem. I think there is a thread about this problem in Piazza
00:14:12 Gabriela Gonzalez: I’ll check it out! Thanks :
00:14:14 Gabriela Gonzalez: :)
00:14:53 Olufunke Fasawe: Hi Kelsey, this is Lab 03, right?
00:15:13 Olufunke Fasawe: I don't see the Rmd file in the labs folder on Rstudio
00:16:13 Kelsey MacCuish: yes lab 3!
00:16:42 Kelsey MacCuish: are you all going to data hub through the course website?
00:16:53 Olufunke Fasawe: Thanks. I now see it.
00:16:56 Kelsey MacCuish: make sure you click the tab on the left on the course website - this pulls in all of the most recent files
00:17:00 Olufunke Fasawe: Went through the course website
00:19:19 Lupita Ambriz: I want to catch up on school work this weekend, I feel like midterms are coming up fast!
00:19:25 Kelsey MacCuish: https://docs.google.com/forms/d/e/1FAIpQLScAbuJoJnmsWybUBScVMsyBJBekBv3-TPI3IjxYismXD94TTQ/viewform?usp=sf_link
00:19:34 Olufunke Fasawe: I am moving to Berkeley from Nigeria this weekend! So, I will be travelling for almost 24 hours
00:27:02 Samantha Krutzfeldt: hello, sorry but my lab is different
00:27:06 Samantha Krutzfeldt: mine too
00:29:50 Jessica Wright: univariate
00:29:57 Olufunke Fasawe: Univariate
00:29:58 Karm Singh (she/her): univariate
00:30:12 Olufunke Fasawe: You use line plots for bivariate data
00:37:13 Samantha Krutzfeldt: I'm confused which function overrides the original data set?
00:55:21 Phoenix Ding: p1 <- ggplot(CS_data, aes(x = GDP_2006, y = CS_rate_100))+ geom_point()
p1
00:55:21 Hiruni Jayasekera (she/her): ggplot(data = CS_data, aes(x = GDP_2006, y = CS_rate_100)) + geom_point() + labs(x = "GDP 2006", y = "CS_rate_100")
00:55:22 Silvana Larrea: ggplot(CS_data, aes(x = GDP_2006, y = CS_rate_100)) + geom_point() + labs (title = "Countries cesarean rate by GDP in 2006", x = "Gross Domestic Product 2006", y = "Cesarean rate" ) + theme_minimal ()
00:59:17 Olufunke Fasawe: No
00:59:23 Julia Hankin: Nope!
01:01:46 Julia Hankin: CS_data_log <- CS_data %>% mutate(log_CS = log(CS_rate_100), log_GDP = log(GDP_2006))
01:06:04 Ijeoma Uche: p3 <-ggplot(data=CS_data_log, aes(x=log_GDP, y=log_CS))+ geom_point(col= "blue")+ labs (x="GDP", y= "Cesarean rate",title= "Cesarean Rate by GDP using the natural log (base e)")
p3
01:06:05 Hiruni Jayasekera (she/her): p3 <- ggplot(data = CS_data_log, aes(x = log_GDP, y = log_CS)) + geom_point()+ labs(x = "log GDP", y = "log CS rate") + theme_minimal()
01:06:06 Silvana Larrea: ggplot(CS_data_log, aes(x = log_GDP, y = log_CS)) + geom_point () + labs (x = "GDP in 2006", y = "Cesaren Rate") + theme_minimal ()
01:08:11 Silvana Larrea: Like the first 50% of the data
01:08:30 Momoh Bona: halfway linear
01:09:13 Annalisa Watson (she/her): ggplot(CS_data_log, aes(x= log_GDP, y= log_CS)) + geom_point() + geom_smooth()
01:09:16 Ijeoma Uche: p4 <-ggplot(data=CS_data_log, aes(x=log_GDP, y=log_CS))+ geom_point(col= "blue")+ geom_smooth()+ labs (x="GDP", y= "Cesarean rate",title= "Cesarean Rate by GDP using the natural log (base e)")
p4
01:09:31 Momoh Bona: p3+geom_smooth
01:11:09 Momoh Bona: -CS_data_log %>% ggplot(aes(x=log_GDP, y=log_CS))+ geom_point(aes(col=Income_Group))+ geom_smooth()
01:16:25 Annalisa Watson (she/her): Is there a place we can see the quiz answers?
01:17:01 Annalisa Watson (she/her): Oh thank you!
I didn’t realize u could click on the answers
01:18:26 Phoenix Ding: CS_data_sub <- CS_data_log %>% filter(Income_Group %in% c("Low income", "Lower middle income", "Upper middle income"))
CS_data_sub
01:19:21 Olufunke Fasawe: Will that be the select function?
01:19:25 Silvana Larrea: I did this, but R tells me that the object Income_Group is not found
01:19:25 Silvana Larrea: CS_data_sub <- CS_data_log %>% filter (Income_Group %in% c("Low income", "Lower middle income", "Upper middle income"))
01:19:31 Chitra Nambiar: group_by and filter?
01:19:40 Momoh Bona: filter
01:20:36 Stacy (Seohyun) Ahn: im having same issue as Silvana
01:20:43 Hiruni Jayasekera (she/her): same here
01:21:09 Annalisa Watson (she/her): I think that happened to me when I didn’t uppercase the first letter of the variable names
01:21:13 Aliza Adler: Same with me! And I know Income Group is a variable because I can see it in the CS_data_log
01:21:50 Aliza Adler: I have it in the right case but it still says not found
01:21:55 Stacy (Seohyun) Ahn: yea same
01:21:57 Hiruni Jayasekera (she/her): my variable names are capitalized :/ you mean like “Low income”, “Lower middle income”, etc
01:22:26 Momoh Bona: CS_data_log %>% filter(Income_Group %in% c("Low income", "Lower middle income", "Upper middle income"))
01:23:20 Hiruni Jayasekera (she/her): ooh that worked
01:23:21 Stacy (Seohyun) Ahn: omg finally worked thanks Kelsey ily
01:23:24 Silvana Larrea: Yes, that worked!
01:23:25 Silvana Larrea: Thanks!
01:23:32 Hiruni Jayasekera (she/her): thanks!
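[Editor's note on the filter step] The sketch below shows why the chat's `filter()` calls use `%in%` and not `==` to keep several income groups. The data frame `toy` and its values are made up for illustration; only the column name `Income_Group` and the three group labels come from the chat. With `==`, R recycles the three labels down the column and compares position by position, so rows are kept only when they line up by chance. The chat does not record what fixed the "object Income_Group not found" error. One common cause, offered here as an assumption, is that `filter` resolves to `stats::filter()` when dplyr is not attached, so calling `dplyr::filter()` explicitly sidesteps it.

```r
library(dplyr)

# Hypothetical toy data; only the column name and labels mirror the chat.
toy <- data.frame(
  Income_Group = c("Low income", "Lower middle income", "Upper middle income",
                   "High income", "Upper middle income", "Low income"),
  log_GDP = c(6.1, 7.0, 8.2, 10.3, 8.5, 5.9)
)

groups <- c("Low income", "Lower middle income", "Upper middle income")

# %in% asks, for each row, "is this value one of the three labels?"
with_in <- toy %>% dplyr::filter(Income_Group %in% groups)

# == recycles `groups` along the column and compares element by element,
# so it silently drops rows whose label is valid but out of position.
with_eq <- toy %>% dplyr::filter(Income_Group == groups)

nrow(with_in)  # every row except "High income"
nrow(with_eq)  # only the rows that happen to align with the recycled labels
```

On this toy data the two calls return different numbers of rows, which is the kind of mismatch that can make one plot show more points than another.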
01:30:07 Phoenix Ding: I do not think we can get same answer if we use ==
01:31:07 Phoenix Ding: p8 <- ggplot(CS_data_sub, aes(x = log_GDP, y = log_CS))+ geom_point(aes(col=Income_Group))+geom_smooth()
p8
01:31:19 Momoh Bona: CS_data_sub %>% ggplot(aes(x=log_GDP, y=log_CS))+ geom_point(aes(col=Income_Group))+ geom_smooth()
01:31:52 Ijeoma Uche: no
01:32:14 Phoenix Ding: I think it is just look more linear
01:34:45 Lian Hsiao: I’m not sure why my plot has more points than the one on the screen? Tho I’m still passing the tests
01:34:51 Hiruni Jayasekera (she/her): same here
01:34:53 Phoenix Ding: same
01:34:53 Shannon Mohler: Same
01:34:59 Samantha Krutzfeldt: me too
01:35:11 Phoenix Ding: How many objects do you have in CS_data_sub?
01:35:32 Jessica Wright: mine only has one point
01:35:37 Ijeoma Uche: same
01:35:37 Annalisa Watson (she/her): Is color = the same as col = ?
01:35:39 Momoh Bona: I have more points!
01:35:51 Lian Hsiao: Got it, thank you!
01:35:55 Phoenix Ding: Thank you
01:36:05 Ijeoma Uche: I have the same code but my graph looks more cramped
01:36:10 Momoh Bona: oh thanks!
01:36:41 Annalisa Watson (she/her): Thx!!
01:37:12 Ijeoma Uche: p9 <- lm(log_CS ~ log_GDP,CS_data_sub)
p9
01:41:50 Ijeoma Uche: Slope:0.8193
01:41:59 Julia Hankin: No idea if this is right, but: a one unit change in the number of GDP is associated with an increase of cesarean delivery rate of 0.8193.
01:48:08 Chitra Nambiar: (-3.9405 * log(2000) )+ 0.8193
01:48:31 Phoenix Ding: 9.8?
01:51:16 Stacy (Seohyun) Ahn: take exp
01:56:21 Ijeoma Uche: Why is it a percent
01:57:04 Momoh Bona: can you please review p9?
01:57:29 Momoh Bona: I got a different value!
01:58:10 Phoenix Ding: I say no, because 5000 is on the far right end, kind of not linear any more
01:58:40 Silvana Larrea: It would not be appropriate, because we would be extrapolating this information, since there is no country in our data with a GDP of 50,000 and we could get back a biased result.
Looking at the data in the full dataset, I think that the linear model would over-predict the cesarean rate
01:59:16 Annalisa Watson (she/her): How can you tell it’s outside the range of our data? From the graph or the data set?
02:02:20 Ijeoma Uche: So would it over predict?
02:02:43 Shannon Mohler: I have to head out, thank you so much!
02:03:31 Chitra Nambiar: under
02:03:36 Chitra Nambiar: sorry, over
02:03:48 Stacy (Seohyun) Ahn: over
02:07:06 Yulan Xie: Thank you!
02:07:09 Stacy (Seohyun) Ahn: Thanks so much!
02:07:14 Silvana Larrea: Thank you!
02:07:14 Aliza Adler: Thank you!!
02:07:17 Joyce Qiao: thanks so much!
02:07:19 Annalisa Watson (she/her): Thank you!
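[Editor's note on the prediction at 01:48 to 01:51] The worked calculation below is a sketch of the prediction the chat was discussing, under two assumptions taken from the messages and not from the model output itself: that 0.8193 is the slope (as stated at 01:41:50) and that -3.9405 is the intercept of `lm(log_CS ~ log_GDP, CS_data_sub)`. The expression posted at 01:48:08 attaches the two numbers the other way around. `log` is the natural log, which is R's default and matches the plot titles in the chat.

```latex
\begin{align*}
\log\bigl(\widehat{\text{CS}}\bigr)
  &= -3.9405 + 0.8193 \cdot \log(\text{GDP}) \\
  &= -3.9405 + 0.8193 \cdot \log(2000) \\
  &\approx -3.9405 + 0.8193 \cdot 7.6009 \\
  &\approx 2.2869 \\[4pt]
\widehat{\text{CS}}
  &= e^{2.2869} \approx 9.84
\end{align*}
```

The final step is the "take exp" from 01:51:16: the model predicts the log of the rate, so it has to be exponentiated to get back to the units of `CS_rate_100`, and the result agrees with the "9.8?" posted at 01:48:31. Because both variables are logged, the slope reads as a relative change: a 1% increase in GDP is associated with roughly a 0.82% increase in the cesarean rate, which is why the interpretation is phrased in percent.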