Weighted Least Squares (WLS)
________________________________________________________________________

Data set: from Graybill and Iyer (1994), page 576

Goal: investigate the relationship between carbon monoxide in the air
and number of automobiles for 13 cities of population > 50,000 that do
not have clean air programs.

Y=carbon monoxide in the air (units: parts per million)
X=number of automobiles (in thousands)
________________________________________________________________________

Splus commands:

No new Splus commands, just a new option in lsfit and lm:

lsfit(x, y, wt=...)
lm(formula, data=, weights=...)
________________________________________________________________________

carb<-read.table("/home/fac/d/jah/DATA/grayiyer/carbmon.dat",header=T)
> carb
      Y    X
 1 5817  873
 2 1063  109
 3 2616  398
 4 2018  353
 5 3147  506
 6 7210 1026
 7 4339  862
 8 5153  742
 9 4450  786
10 5591  896
11 2747  377
12 3712  720
13 2354  655

#OLS model
carb.lsfit<-lsfit(carb[,2],carb[,1])
ls.print(carb.lsfit)

Residual Standard Error = 742.6138, Multiple R-Square = 0.8365
N = 13, F-statistic = 56.2719 on 1 and 11 df, p-value = 0

             coef  std.err  t.stat  p.value
Intercept 40.5045 549.5987  0.0737   0.9426
        X  5.9846   0.7978  7.5015   0.0000

#RESIDUAL PLOTS FOR OLS MODEL
e.star<-ls.diag(carb.lsfit)$stud.res
par(mfrow=c(2,3))
plot(carb$X,carb$Y,xlab="# of automobiles",ylab="CO levels")
abline(carb.lsfit)
title("Fig 1: Scatter plot with fitted OLS reg. line")
plot(carb$X,e.star)
abline(h=0)
title("Fig 2: Residual plot for OLS")
qqnorm(e.star)
title("Fig 3: QQ plot for OLS")

#WLS model
#Note: it is suspected that the standard deviation of carbon monoxide
#levels is related to the number of automobiles. Researchers have
#suggested using weights equal to 1/x.
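A short sketch of what weights equal to 1/x imply may help here. It rests on an assumption that the handout does not state explicitly: that the error variance is proportional to x, Var(e_i) = sigma^2 * x_i, so the standard deviation grows like sqrt(x_i). The symbols beta_0, beta_1, sigma and w_i are notation introduced for this sketch only.

```latex
% WLS criterion with weights w_i = 1/x_i
\[
S(\beta_0,\beta_1)=\sum_{i=1}^{13} w_i\,(y_i-\beta_0-\beta_1 x_i)^2,
\qquad w_i=\frac{1}{x_i}
\]
% Equivalent OLS problem: divide the model through by sqrt(x_i)
\[
\frac{y_i}{\sqrt{x_i}}
 =\beta_0\,\frac{1}{\sqrt{x_i}}+\beta_1\sqrt{x_i}+\frac{\varepsilon_i}{\sqrt{x_i}},
\qquad
\operatorname{Var}\!\left(\frac{\varepsilon_i}{\sqrt{x_i}}\right)
 =\frac{\sigma^2 x_i}{x_i}=\sigma^2
\]
```

Under that assumption the transformed errors have constant variance, which is why the weighted fit can be treated like an ordinary one. If the standard deviation were instead proportional to x itself, the matching weights would be 1/x^2 rather than 1/x.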
carb.lsfit2<-lsfit(carb[,2],carb[,1],wt=1/carb[,2])
ls.print(carb.lsfit2)

Residual Standard Error = 27.7958, Multiple R-Square = 0.8975
N = 13, F-statistic = 96.356 on 1 and 11 df, p-value = 0

              coef  std.err  t.stat  p.value
Intercept 371.6209 297.5527  1.2489   0.2376
        X   5.4662   0.5569  9.8161   0.0000

#Residual plots for WLS model
e.star2<-ls.diag(carb.lsfit2)$stud.res
plot(carb$X,carb$Y,xlab="# of automobiles",ylab="CO levels")
abline(carb.lsfit2)
title("Fig 4: Scatter plot with fitted WLS reg. line")
plot(carb$X,e.star2)
abline(h=0)
title("Fig 5: Residual plot for WLS")
qqnorm(e.star2)
title("Fig 6: QQ plot for WLS")

#THE WLS USING THE LM COMMAND
carb.lm<-lm(Y~X,data=carb,weights=1/X)
summary(carb.lm)

Coefficients:
               Value Std. Error t value Pr(>|t|)
(Intercept) 371.6209   297.5527  1.2489   0.2376
          X   5.4662     0.5569  9.8161   0.0000

Residual standard error: 27.8 on 11 degrees of freedom
Multiple R-Squared: 0.8975
F-statistic: 96.36 on 1 and 11 degrees of freedom, the p-value is 8.897e-07

**SEE CLASS DISCUSSION REGARDING THE RESIDUAL PLOTS
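One way to see what the weights option is doing is to reproduce the WLS coefficients by two other routes: OLS through the origin on variables rescaled by sqrt(weight), and the weighted normal equations solved directly. The following is a minimal sketch, not part of the original session. It is written for R and assumes the same calls behave alike in Splus. The data are typed in from the listing above instead of being read from carbmon.dat, and the names carb2, w, trans, fit.wls, fit.trans, X.mat and b.closed are made up for this illustration.

```r
# Data typed in from the handout listing (assumption: matches carbmon.dat)
carb2 <- data.frame(
  Y = c(5817, 1063, 2616, 2018, 3147, 7210, 4339,
        5153, 4450, 5591, 2747, 3712, 2354),
  X = c(873, 109, 398, 353, 506, 1026, 862,
        742, 786, 896, 377, 720, 655)
)

# Weights 1/X, as suggested in the handout
w <- 1 / carb2$X

# (a) WLS through the weights argument of lm
fit.wls <- lm(Y ~ X, data = carb2, weights = 1/X)

# (b) OLS with no automatic intercept on variables scaled by sqrt(w);
#     the column x0.star plays the role of the transformed intercept
trans <- data.frame(
  y.star  = sqrt(w) * carb2$Y,
  x0.star = sqrt(w),
  x1.star = sqrt(w) * carb2$X
)
fit.trans <- lm(y.star ~ x0.star + x1.star - 1, data = trans)

# (c) Weighted normal equations: solve (X'WX) b = X'Wy
#     w * X.mat multiplies row i of X.mat by w[i]
X.mat <- cbind(1, carb2$X)
b.closed <- solve(t(X.mat) %*% (w * X.mat), t(X.mat) %*% (w * carb2$Y))

# All three should print the same intercept and slope
print(coef(fit.wls))
print(coef(fit.trans))
print(drop(b.closed))
```

The three sets of coefficients should agree with each other and with the values printed by ls.print and summary above (intercept 371.6209, slope 5.4662). The residual standard error of the transformed fit in (b) should likewise correspond to the 27.8 reported for the weighted fit, since its residuals are the original residuals multiplied by sqrt(w).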