00:13:09 Ekua-Yaaba Monkah: Good! how are you?
00:13:15 Silvana Larrea: Doing good Kelsey, thank you!
00:13:19 Christopher Patterson: hi
00:13:41 Gisselle Rosales: Going home!
00:13:45 Christopher Patterson: I'm gonna play a bunch of World of Warcraft and go hiking with my dad
00:13:55 Ekua-Yaaba Monkah: submitting my grad school app!
00:13:59 Christopher Patterson: Whitewater in Palm Springs
00:14:00 Christopher Patterson: socal
00:14:08 Ekua-Yaaba Monkah: yes!
00:15:59 Leslie Giglio: Kelsey I'm not seeing the Lab 11 in the labs folder
00:17:06 Leslie Giglio: Oops sorry, refreshed and found it
00:20:40 Ijeoma Uche: One sec lol
00:20:51 Christopher Patterson: I did lm(medv ~ nox, boston2)
00:21:35 Ijeoma Uche: Do we have to use tidy>
00:21:37 Ijeoma Uche: ?*
00:24:10 Rachel Harvill: In the not row, estimate column
00:24:11 Ijeoma Uche: Slope:-33.92
00:24:13 Annalisa Watson (she/her): The slope is the nox estimate
00:24:13 Rachel Harvill: nox*
00:24:29 Chitra Nambiar: house price decreases with inc in nox
00:26:26 Hiruni Jayasekera (she/her): does not equal 0?
00:26:28 Ijeoma Uche: Not equal to 0?
00:26:54 Rachel Harvill: We can reject the null since the p-value is < 0.05
00:26:55 Hiruni Jayasekera (she/her): reject the null because p is small
00:28:04 Christopher Patterson: I think the r^2 value is telling us how well our data fits our line
00:28:36 Silvana Larrea: It explains how much of the variation median household value is explained by NOX?
00:29:16 Hiruni Jayasekera (she/her): 0.1826
00:30:16 Hiruni Jayasekera (she/her): i would say it doesn't expalin much?
00:30:35 Ijeoma Uche: Can you explain this concept again?
00:31:16 Ijeoma Uche: Is it out of 1? Or is it a percentage?
00:31:41 Rachel Harvill: Makes sense because there are many other factors that affect home prices
00:32:10 Ekua-Yaaba Monkah: When I ran tidy(boston_lm) I keep getting an error
00:33:36 Christopher Patterson: Scatter plot ?
00:33:39 Genesis Navarrete: qq plot?
00:34:12 Annalisa Watson (she/her): Box plot
00:34:29 Christopher Patterson: fitted vs residuals
00:38:06 Hiruni Jayasekera (she/her): for some reason my plots are printing as dataframes? :/
00:39:00 Kelsey MacCuish: hmmm did you augment the data first?
00:40:10 Hiruni Jayasekera (she/her): not for the scatter plot
00:40:30 Kelsey MacCuish: okay we’ll go over it in a second and you can check your code with mine!
00:44:25 Lian Hsiao: Can you explain again the purpose of augmenting the data and what it does?
00:44:26 Julia Hankin: Hey Kelsey, I get an error when I run that code: object '.fitted' not found
00:44:34 Julia Hankin: Any ideas where I’m going wrong?
00:47:23 Hiruni Jayasekera (she/her): my plots still are printing as dataframes :/
00:47:37 Hiruni Jayasekera (she/her): ggplot(augment_boston, aes(sample = .resid)) + geom_qq() + geom_qq_line()
00:48:37 Christopher Patterson: What's important about the residuals being normally distributed again?
00:48:56 Anai Ramos: Kelsey, I keep getting an error saying boston_lm is not found
00:49:15 Hiruni Jayasekera (she/her): augment_boston <- augment(boston_lm)
augment_boston
library(ggplot2)
plot1 <- ggplot(augment_boston, aes(x = nox, y = medv)) + geom_point() + labs(x = "Nitrogen oxide", y = "Median home value") + geom_smooth(method = "lm", se = F) + geom_segment(aes(xend = nox, yend= .fitted), lty = 2)
00:49:46 Hiruni Jayasekera (she/her): just kidding it just suddenly worked
00:49:49 Christopher Patterson: yes it does, thanks.
00:50:26 Silvana Larrea: plot3 <- ggplot(augmented_1, aes (y = .resid, x = .fitted)) + geom_point () + geom_hline (aes(yintercept = 0)) + theme_minimal (base_size = 15)
plot3
00:50:28 Leslie Giglio: plot3 <- ggplot(augment_boston, aes(y = .resid, x = .fitted)) + geom_point() + theme_minimal(base_size = 15) + labs(y = "Residuals", x = "Fitted values", title = "(c) Fitted vs. residuals")
00:52:29 Christopher Patterson: the closer we're clustered at y=0 the better predictions we'll get from our model
00:53:47 Hiruni Jayasekera (she/her): more time would be great
00:56:28 Jessica Fields (she/her/hers): plot4 <- ggplot(reshape, aes(y = value)) + geom_boxplot(aes(fill = type))
01:01:29 Hiruni Jayasekera (she/her): sorry what was the question?
01:01:54 Hiruni Jayasekera (she/her): is the width of the box the variation in the value?
01:02:01 Annalisa Watson (she/her): Are the residuals supposed to be short?
01:02:04 Rachel Harvill: I’m confused honestly
01:03:49 Julia Hankin: Do you mind explaining what the median of the distribution of residuals means? Or like how to interpret that value?
01:04:25 Hiruni Jayasekera (she/her): so the residuals have a lot of variation?
01:06:06 Ekua-Yaaba Monkah: Can you please show the code for the box plot again?
01:06:43 Hiruni Jayasekera (she/her): ekua i think it’s this:
reshape <- augment_boston %>% dplyr::select(.resid, medv) %>% gather(key = 'type', value = 'value', medv, .resid)
plot4 <- ggplot(reshape, aes(y = value)) + geom_boxplot(aes(fill = type))
01:08:27 Ekua-Yaaba Monkah: Thank u
01:11:26 Christopher Patterson: It's showing us that the residuals are not normally distributed
01:11:26 Annalisa Watson (she/her): It is not normally distributed?
01:11:27 Diane Arnos (she/her): It’s not normal
01:11:31 Diane Arnos (she/her): right skewed?
01:11:33 Leslie Giglio: The lengths of the residuals are not normally distributed
01:12:56 Rachel Harvill: Negative linear relationship?
01:15:16 Julia Hankin: Sorry, could we briefly review #5?
01:15:21 Hiruni Jayasekera (she/her): in the first plot the points are the observed values right?
01:15:31 Julia Hankin: About meeting assumptions
01:16:25 Julia Hankin: Is there an assumption of independence that we haven’t met?
01:17:07 Julia Hankin: Oh got it! thanks
01:20:24 Annalisa Watson (she/her): p6 <- lm(medv ~ dis, boston2)
01:21:29 Silvana Larrea: For every increase in one unit of the weighted mean of distances to five Boston employment centers, there is an increase of $1,091.613 USD in the median value of owner-occupied homes.
01:23:10 Jessica Fields (she/her/hers): Is the null that the slope would be equal to 0 (no association)?
01:23:27 Diane Arnos (she/her): and the alternate is that it’s not equal to zero?
01:23:53 Annalisa Watson (she/her): Which p value do we look at?
01:24:11 Rachel Harvill: The one in the dis row I think
01:24:11 Silvana Larrea: The p-value of the slope
01:24:13 Hiruni Jayasekera (she/her): would it be the one in the same row as the slope?
01:24:18 Annalisa Watson (she/her): ty!
01:24:25 Jessica Fields (she/her/hers): Since the p value is v v small, we can conclude that under the null, there’s a very small chance of observing what we saw - so we can reject the null. right?
01:26:15 Hiruni Jayasekera (she/her): estimate +/- 1.96*se
01:27:16 Aliza Adler: Do we use the qt function ?
01:27:48 Annalisa Watson (she/her): Degree of freedom
01:27:50 Ijeoma Uche: Df?
01:27:59 Silvana Larrea: N-2?
01:28:17 Annalisa Watson (she/her): .95
01:28:18 Hiruni Jayasekera (she/her): 97.5
01:28:38 Annalisa Watson (she/her): Oh I see
01:30:11 Hiruni Jayasekera (she/her): do we need to do lower.tail = false?
01:30:31 Hiruni Jayasekera (she/her): oh right
01:30:53 Lian Hsiao: Can u clarify again why the df in this case in n-2 and not n-1??
01:33:27 Jessica Fields (she/her/hers): For the part of the question asking about whether we’d expect the direction of the relationship to hold today - do we think about whether the CI crosses 0 or do we just think about whether we think the relationship would have changed between the 70s and now?
01:34:04 Lian Hsiao: Ok cool thank you!
01:34:48 Jessica Fields (she/her/hers): Ok thanks
01:35:13 Annalisa Watson (she/her): 1.091613 - t_star*0.1883784
01:35:33 Annalisa Watson (she/her): Yes!
01:38:34 Hiruni Jayasekera (she/her): 0 is not in the interval so we can say that in 95/100 cis we make we can reject the null? not really sure on the wording
01:39:54 Jessica Fields (she/her/hers): 0.06
01:40:30 Jessica Fields (she/her/hers): So distance to these employment centers doesn’t explain much of the variance in median home prices, right? Or am I getting mixed up? : )
01:44:44 Annalisa Watson (she/her): I’m not sure about abline
01:44:51 Hiruni Jayasekera (she/her): same^
01:45:38 Lian Hsiao: What does the se = F mean??
01:46:34 Rachel Harvill: Intercept not yintercept
01:48:03 Jessica Fields (she/her/hers): For q11, is it ok that these are named the same as our earlier plots for the first half of the lab?
01:48:24 Jessica Fields (she/her/hers): Ah so we can rename them?
01:48:31 Jessica Fields (she/her/hers): Ok cool. thanks!
01:48:44 Olufunke Fasawe: Please show the code for p10
01:50:20 Ijeoma Uche: Moderate?
01:55:29 Aliza Adler: I did: ci_dataframe <- predict(p6, ci_dataframe, interval = "confidence")
01:55:29 Silvana Larrea: data1 <- data.frame(dis = 2.5)
01:55:32 Silvana Larrea: predict(lm2, data1, interval = "confidence")
01:55:41 Chitra Nambiar: anybody else having datahub issues?
01:55:46 Anai Ramos: me!
01:55:48 Michele Ko (she/her): Me!
01:55:54 Annalisa Watson (she/her): I was!! Had to restart a couple times but back in now
01:55:54 Hiruni Jayasekera (she/her): same!
01:55:55 Anai Ramos: It’s super slow
01:55:56 Lian Hsiao: Same here
01:56:04 Hiruni Jayasekera (she/her): yup just restarted 3 times and it’s working again
01:56:10 Christopher Patterson: ya same for me
01:56:16 Rachel Harvill: Same here, I had to relaunch my server and seems better now
01:57:35 Hiruni Jayasekera (she/her): what is “fit”?
02:01:39 Hiruni Jayasekera (she/her): oh ok, got it!
02:02:20 Hiruni Jayasekera (she/her): dis = 2.5
02:02:22 Silvana Larrea: Where there are more individual observations?
02:02:29 Ala Koreitem: Can I see the ggplot code again really quickly
02:06:47 Rachel Harvill: Can we go over 13
02:06:51 Ijeoma Uche: And 14
02:06:58 Aliza Adler: ^ yes can we please go over 13?
02:07:07 Olufunke Fasawe: When will the recording for this lab be uploaded to the course page please?
02:07:32 Ekua-Yaaba Monkah: What is p12?
02:08:22 Gabriela Gonzalez: @Olufunke there’s a playlist on YT to access the recording before it’s uploaded on the course website :-) you can go to a recording from a previous week and it should have the playlist on the right side of the screen under the video
02:08:28 Ala Koreitem: Yes! It’s not running for some reason
02:08:43 Ala Koreitem: Never mind it just worked thank u!
02:09:40 Hiruni Jayasekera (she/her): In 95 out of 100 CIs we construct the median home price at distance = 10 will be between $26.88k and 31.73k?