Tuesday, February 5, 2008

Mike: Chapt 3, #10 and #11

Below is my do file that I created. If you copy the code and put it in notepad and then save it with a .do extention, you should be able to run the whole code at once. You will need to the change the first line so that the file knows where the dataset is stored.

use E:\BiostatsII\3.ex.funding.dta
*Questions 10 and 11*Perform influence analyses on the other covariates in the model from question 6.
generate log_dol = log( dollars)
generate log_mort = log(mort)
generate log_yrs = log(yrslost)
generate log_hosp = log(hospdays)
generate log_disabil = log(disabil)
regress log_dol log_hosp log_mort log_yrs log_disabil
predict h, leverage
predict z, rstandard
predict t, rstudent
predict deltab1, dfbeta(log_mort)
predict cook1, cooksd
list id disease h z t deltab1 cook1 if deltab1 >.5 deltab1 <-.5
predict deltab2, dfbeta(log_yrs)
predict cook2, cooksd
list id disease h z t deltab2 cook2 if deltab2 >.5 deltab2 <-.5
predict deltab3, dfbeta(log_hosp)
predict cook3, cooksd
list id disease h z t deltab3 cook3 if deltab3 >.5 deltab3 <-.5
*I didn’t remove peptic ulcers because the cook score was so very low; a cook score that warrants examination is .276 and peptic ulcers had .131 cook score.
predict deltab4, dfbeta(log_disabil)
predict cook4, cooksd
list id disease h z t deltab4 cook4 if deltab4 >.5 deltab4 <-.5
* Regress log[dollars] against log[disabil] and log[hospdays].
regress log_dol log_hosp log_disabil
editpreserveset obs 30replace id = 99 in 30replace hospdays = 1000 in 30replace disabil = 1000 in 30
replace log_hosp=log(hospdays) if id==99
replace log_disabil=log(disabil) if id==99
predict yhat1, xb
predict h1, leverage
predict std_yhat1, stdp
predict std_f1, stdf
generate cil_yhat1= yhat1 -invttail(29-2-1, .025)*std_yhat1
generate ciu_yhat1= yhat1 + invttail(29-2-1, .025)*std_yhat1
generate cil_f1 = yhat1 - invttail(29-2-1,.025)*std_f1
generate ciu_f1 = yhat1 + invttail(29-2-1,.025)*std_f1
generate pred_dollars= exp(yhat1)
generate cil_dollarsf1 = exp(cil_f1)
generate ciu_dollarsf1 = exp(ciu_f1)
generate cil_dollarsp1 = exp(cil_yhat1)
generate ciu_dollarsp1 = exp(ciu_yhat1)
list hospdays disabil pred_dollars cil_dollarsp1 ciu_dollarsp1 cil_dollarsf1 ciu_dollarsf1 if id==99

Monday, February 4, 2008

HW 3 #13

We'll see if this is right in class...

#13

generate logdollars = log(dollars)

generate logdisabil = log(disabil)

generate loghospdays = log(hospdays)

regress logdollars logdisabil loghospdays

predict h, leverage

predict z, rstandard

predict t, rstudent

predict deltatab1, dfbeta(logdisabil)

predict cook, cooksd

display invttail(26,.025)

label variable deltatab1 "Delta Beta for log[disabil]"

scatter deltatab1 t, c(i l) s(o i)

sort t

list h z t deltatab1 cook in -3/-1

regress logdollars logdisabil loghospdays if id ~= 1

display (.90955 - .8168)/.13148

Chapter 3: 7-9

Here is my code for 7-9:

Question 7:

predict z, rstandard

predict deltab1, dfbeta(logmort)

predict cook, cooksd


display invttail(22,.025)


sort t

list disease h z t deltab1 cook if abs(deltab1)>0.5

Question 8

label variable logdol "Log thousands of dollars of NIH research funds per year"

label variable logprev "Log disease prevalence rate per 1000"

label variable loghospdays "Log thousands of hospital-days"

label variable logmort "Log disease mortality rate per 1000"

label variable logyrslost "Log thousands of life-years lost"

label variable logdisbil "Log thousands of disability-adjusted life-years lost"

graph matrix logdol loghospdays logmort logyrslost logdisbil, msymbol(Oh) mcolor(gs10)

Question 9

regress logdol loghospdays logmort logyrslost logdisbil if abs(deltab1)<0.5


display (-.0028409 -.170048)/.1852302

Chapter 3, #1-3

*Question 1...
* 95% CI for Age = 1.43 +or- 1.96(.46) = (.5284, 2.3316)
* Does NOT cross zero, so Null is REJECTED.
* 95% CI for Weight = 25.9 +or- 1.96(31) = (-34.86, 86.66)
* Does cross zero, so the Null is NOT REJECTED.
* The CI of the Age factor is quite tight, so the magnitude of effect for every one unit chage in age is quite a bit. The Weight factor CI is quite large, so there has to be an extremely large change in weight to see an effect of magnitude.
* Problem 2. Explore the relationship between dollars and the other covariates listed above. Fit a model that you feel best captures this relationship.
use "(data location)\3.ex.Funding.dta", clear
regress dollars incid preval hospdays mort yrslost disabil
regress dollars incid preval hospdays mort yrslost
regress dollars preval hospdays mort yrslost
regress dollars hospdays mort yrslost
regress dollars hospdays yrslost
* Problem 3. Perform a forward stepwise linear regression of log(dollars) against the following potental covariates: log(incid), log(preval), log(hospdays), log(mort), log(yrslost) and log(disabil).
generate logdol = log( dollars)
generate logincid = log(incid)
generate logprev = log( preval)
generate loghosp = log( hospdays)
generate logmort = log(mort)
generate logyrs = log( yrslost)
generate logdis = log( disabil)
* Threshold = .1
sw regress logdol logincid logprev loghosp logmort logyrs logdis, forward pe(.1)
* Threshold = .2
sw regress logdol logincid logprev loghosp logmort logyrs logdis, pr(.2)
* The covariates selected are logdis loghosp for the cutoff = .1 and logyrs logdis loghosp logmort for the cutoff = .2

#6

I'm confused as to how to calculate the degrees of freedom. I came up with 28 diseases in our regression of logdol loghospdays logmort logyrslost logdisbil. Thus, I calculated the degrees of freedom as (28 - 5 - 1) = 22 based on the results of the regression. What did you all get for the degrees of freedom? Based on 22 as the degrees of freedom, I obtained the following results for #6:

predict yhat,xb
predict h, leverage
predict std_yhat, stdp
predict std_f, stdf
generate cil_yhat = yhat - invttail(28-5-1,.025)*std_yhat
generate ciu_yhat = yhat + invttail(28-5-1,.025)*std_yhat
generate cil_f = yhat - invttail(28-5-1,.025)*std_f
generate ciu_f = yhat + invttail(28-5-1,.025)*std_f
generate cil_sbpf = exp(cil_f)
generate ciu_sbpf = exp(ciu_f)
predict t, rstudent
display invttail(22,.025)
2.0738731
lowess t yhat, bwidth(0.2) msymbol(Oh) mcolor(gs10) clwidth(thick)yline(-2.07 0 2.07)
generate out = t > 1.96 | t < -1.96
tabulate out