00:12:03	Gabriela Gonzalez:	is lab03 not showing up for anyone else on RStudio?
00:13:20	Julia Hankin:	Mine’s showing up, Gabby! That’s weird :(
00:13:24	Silvana Larrea:	Hi Gabriela, mine is showing too
00:13:50	Gabriela Gonzalez:	rstudio hasn’t been on my side this morning :’(
00:14:03	Silvana Larrea:	I think last week there were also other students that had this problem. I think there is a thread about this problem in Piazza
00:14:12	Gabriela Gonzalez:	I’ll check it out! Thanks :
00:14:14	Gabriela Gonzalez:	:)
00:14:53	Olufunke Fasawe:	Hi Kelsey, this is Lab 03, right?
00:15:13	Olufunke Fasawe:	I don't see the Rmd file in the labs folder on Rstudio
00:16:13	Kelsey MacCuish:	yes lab 3!
00:16:42	Kelsey MacCuish:	are you all going to data hub through the course website?
00:16:53	Olufunke Fasawe:	Thanks. I now see it.
00:16:56	Kelsey MacCuish:	make sure you click the tab on the left on the course website - this pulls in all of the most recent files
00:17:00	Olufunke Fasawe:	Went through the course website
00:19:19	Lupita Ambriz:	I want to catch up on school work this weekend, I feel like midterms are coming up fast!
00:19:25	Kelsey MacCuish:	https://docs.google.com/forms/d/e/1FAIpQLScAbuJoJnmsWybUBScVMsyBJBekBv3-TPI3IjxYismXD94TTQ/viewform?usp=sf_link
00:19:34	Olufunke Fasawe:	I am moving to Berkeley from Nigeria this weekend! So, I will be travelling for almost 24 hours
00:27:02	Samantha Krutzfeldt:	hello, sorry but my lab is different
00:27:06	Samantha Krutzfeldt:	mine too
00:29:50	Jessica Wright:	univariate
00:29:57	Olufunke Fasawe:	Univariate
00:29:58	Karm Singh (she/her):	univariate
00:30:12	Olufunke Fasawe:	You use line plots for bivariate data
00:37:13	Samantha Krutzfeldt:	I'm confused: which function overrides the original data set?
00:55:21	Phoenix Ding:	p1 <- ggplot(CS_data, aes(x = GDP_2006, y = CS_rate_100))+
  geom_point()
p1
00:55:21	Hiruni Jayasekera (she/her):	ggplot(data = CS_data, aes(x = GDP_2006, y = CS_rate_100)) + geom_point() + labs(x = "GDP 2006", y = "CS_rate_100")
00:55:22	Silvana Larrea:	ggplot(CS_data, aes(x = GDP_2006, y = CS_rate_100)) +
  geom_point() + 
  labs (title = "Countries cesarean rate by GDP in 2006", x = "Gross Domestic Product 2006", y = "Cesarean rate" ) +
  theme_minimal ()
00:59:17	Olufunke Fasawe:	No
00:59:23	Julia Hankin:	Nope!
01:01:46	Julia Hankin:	CS_data_log <- CS_data %>% mutate(log_CS = log(CS_rate_100), 
                                  log_GDP = log(GDP_2006))
01:06:04	Ijeoma Uche:	p3 <-ggplot(data=CS_data_log, aes(x=log_GDP, y=log_CS))+ geom_point(col= "blue")+
  labs (x="GDP", y= "Cesarean rate",title= "Cesarean Rate by GDP using the natural log (base e)")
p3
01:06:05	Hiruni Jayasekera (she/her):	p3 <- ggplot(data = CS_data_log, aes(x = log_GDP, y = log_CS)) + geom_point()+ labs(x = "log GDP", y = "log CS rate") + theme_minimal()
01:06:06	Silvana Larrea:	ggplot(CS_data_log, aes(x = log_GDP, y = log_CS)) + 
  geom_point () +
  labs (x = "GDP in 2006", y = "Cesarean Rate") + 
  theme_minimal ()
01:08:11	Silvana Larrea:	Like the first 50% of the data
01:08:30	Momoh Bona :	halfway linear
01:09:13	Annalisa Watson (she/her):	ggplot(CS_data_log, aes(x= log_GDP, y= log_CS)) + geom_point() + geom_smooth()
01:09:16	Ijeoma Uche:	p4 <-ggplot(data=CS_data_log, aes(x=log_GDP, y=log_CS))+ geom_point(col= "blue")+
  geom_smooth()+
  labs (x="GDP", y= "Cesarean rate",title= "Cesarean Rate by GDP using the natural log (base e)")
p4
01:09:31	Momoh Bona :	p3 + geom_smooth()
01:11:09	Momoh Bona :	CS_data_log %>%
  ggplot(aes(x=log_GDP, y=log_CS))+
  geom_point(aes(col=Income_Group))+
  geom_smooth()
01:16:25	Annalisa Watson (she/her):	Is there a place we can see the quiz answers?
01:17:01	Annalisa Watson (she/her):	Oh thank you! I didn’t realize u could click on the answers
01:18:26	Phoenix Ding:	CS_data_sub <- CS_data_log %>% filter(Income_Group %in% c("Low income", "Lower middle income", "Upper middle income"))
CS_data_sub
01:19:21	Olufunke Fasawe:	Will that be the select function?
01:19:25	Silvana Larrea:	I did this, but R tells me that the object Income_Group is not found
01:19:25	Silvana Larrea:	CS_data_sub <- CS_data_log %>% 
  filter (Income_Group %in% c("Low income", "Lower middle income", "Upper middle income"))
01:19:31	Chitra Nambiar:	group_by and filter?
01:19:40	Momoh Bona :	filter
01:20:36	Stacy (Seohyun) Ahn:	im having same issue as Silvana
01:20:43	Hiruni Jayasekera (she/her):	same here
01:21:09	Annalisa Watson (she/her):	I think that happened to me when I didn’t uppercase the first letter of the variable names
01:21:13	Aliza Adler:	Same with me! And I know Income Group is a variable because I can see it in the CS_data_log
01:21:50	Aliza Adler:	I have it in the right case but it still says not found
01:21:55	Stacy (Seohyun) Ahn:	yea same
01:21:57	Hiruni Jayasekera (she/her):	my variable names are capitalized :/ you mean like “Low income”, “Lower middle income”, etc
01:22:26	Momoh Bona :	CS_data_log %>% filter(Income_Group %in% c("Low income", "Lower middle income", "Upper middle income"))
01:23:20	Hiruni Jayasekera (she/her):	ooh that worked
01:23:21	Stacy (Seohyun) Ahn:	omg finally worked thanks Kelsey ily
01:23:24	Silvana Larrea:	Yes, that worked!
01:23:25	Silvana Larrea:	Thanks!
01:23:32	Hiruni Jayasekera (she/her):	thanks!
01:30:07	Phoenix Ding:	I do not think we can get same answer if we use ==
01:31:07	Phoenix Ding:	p8 <- ggplot(CS_data_sub, aes(x = log_GDP, y = log_CS))+
  geom_point(aes(col=Income_Group))+geom_smooth()
p8
01:31:19	Momoh Bona :	CS_data_sub %>%
  ggplot(aes(x=log_GDP, y=log_CS))+
  geom_point(aes(col=Income_Group))+
  geom_smooth()
01:31:52	Ijeoma Uche:	no
01:32:14	Phoenix Ding:	I think it just looks more linear
01:34:45	Lian Hsiao:	I’m not sure why my plot has more points than the one on the screen? Tho I’m still passing the tests
01:34:51	Hiruni Jayasekera (she/her):	same here
01:34:53	Phoenix Ding:	same
01:34:53	Shannon Mohler:	Same
01:34:59	Samantha Krutzfeldt:	me too
01:35:11	Phoenix Ding:	How many objects do you have in CS_data_sub?
01:35:32	Jessica Wright:	mine only has one point
01:35:37	Ijeoma Uche:	same
01:35:37	Annalisa Watson (she/her):	Is color = the same as col = ?
01:35:39	Momoh Bona :	I have more points!
01:35:51	Lian Hsiao:	Got it, thank you!
01:35:55	Phoenix Ding:	Thank you
01:36:05	Ijeoma Uche:	I have the same code but my graph looks more cramped
01:36:10	Momoh Bona :	oh thanks!
01:36:41	Annalisa Watson (she/her):	Thx!!
01:37:12	Ijeoma Uche:	p9 <- lm(log_CS ~ log_GDP,CS_data_sub)
p9
01:41:50	Ijeoma Uche:	Slope:0.8193
01:41:59	Julia Hankin:	No idea if this is right, but: a one-unit change in GDP is associated with an increase in the cesarean delivery rate of 0.8193.
01:48:08	Chitra Nambiar:	-3.9405 + (0.8193 * log(2000))
01:48:31	Phoenix Ding:	9.8?
01:51:16	Stacy (Seohyun) Ahn:	take exp
01:56:21	Ijeoma Uche:	Why is it a percent
01:57:04	Momoh Bona :	can you please review p9?
01:57:29	Momoh Bona :	I got a different value!
01:58:10	Phoenix Ding:	I say no, because 5000 is on the far right end, kind of not linear any more
01:58:40	Silvana Larrea:	It would not be appropriate, because we would be extrapolating this information, since there is no country in our data with a GDP of 50,000 and we could get back a biased result. Looking at the data in the full dataset, I think that the linear model would over-predict the cesarean rate
01:59:16	Annalisa Watson (she/her):	How can you tell it’s outside the range of our data? From the graph or the data set?
02:02:20	Ijeoma Uche:	So would it over predict?
02:02:43	Shannon Mohler:	I have to head out, thank you so much!
02:03:31	Chitra Nambiar:	under
02:03:36	Chitra Nambiar:	sorry, over
02:03:48	Stacy (Seohyun) Ahn:	over
02:07:06	Yulan Xie:	Thank you!
02:07:09	Stacy (Seohyun) Ahn:	Thanks so much!
02:07:14	Silvana Larrea:	Thank you!
02:07:14	Aliza Adler:	Thank you!!
02:07:17	Joyce Qiao:	thanks so much!
02:07:19	Annalisa Watson (she/her):	Thank you!