I am trying to do logistic regression with bootstrap case resampling on my data, which has a sample size of 71. It is a case/control study with four binary independent variables. I have the following code and output, but I am unsure about two things.

- Do I need to have "freq NumberHits" in step 3 if I use "outhits" in step 2? I am unsure what the FREQ statement does in the logistic regression.
- Are the 95% CI limits in the output on the scale of the estimated coefficients, so that I would then exponentiate them to obtain the odds ratio interval?

/* 1. compute the statistics on the original data */
proc logistic data=cwd.final3;
   class s8a1yn(ref='1') '28cat01'n(ref='1') '23cyn'n(ref='1')
         dist_wild_10kmyn(ref='1') / param=reference;
   model casenum = '23cyn'n '28cat01'n s8a1yn dist_wild_10kmyn
         / firth covb;   /* original estimates */
run; quit;

title "Bootstrap Distribution of Estimates";
title2 "Case Resampling";

%let NumSamples = 10000;   /* number of bootstrap resamples */
/* original estimates from the Analysis of Maximum Likelihood Estimates table, for later visualization */
%let IntEst  = -3.5228;    /* intercept estimate */
%let Est23   = 1.5088;
%let Est28   = 1.7238;
%let Estdist = 1.6265;
%let Estimp  = 2.2814;

/* 2. Generate many bootstrap samples by using PROC SURVEYSELECT */
proc surveyselect data=sample seed=1234
     out=BootCases(rename=(Replicate=SampleID))
     method=urs            /* resample with replacement */
     sampsize=71           /* each bootstrap sample has N observations */
     outhits               /* use OUTHITS option to suppress the frequency var */
     reps=&NumSamples;     /* generate NumSamples bootstrap resamples */
run;

/* 3. Compute the statistic for each bootstrap sample */
proc logistic data=BootCases outest=PEBoot noprint;
   by sampleID;
   freq NumberHits;
   class '23cyn'n(ref='1') '28cat01'n(ref='1') s8a1yn(ref='1')
         dist_wild_10kmyn(ref='1') / param=reference;
   model casenum = '23cyn'n '28cat01'n s8a1yn dist_wild_10kmyn
         / firth covb;
run; quit;

/* 4. Obtain the 95% CI */
proc stdize data=PEBoot vardef=N pctlpts=2.5 97.5 PctlMtd=ORD_STAT outstat=Pctls;
   var Intercept '23cyn0'n '28cat010'n s8a1yn0 dist_wild_10kmyn0;
run;

proc report data=Pctls nowd;
   where _type_ =: 'P';
   label _type_ = 'Confidence Limit';
   columns ('Bootstrap Confidence Intervals of Estimated coefficients(B=10,000)' _ALL_);
run;
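
For reference, here is a minimal sketch of the exponentiation step the second question asks about. It assumes the percentile rows in Pctls are on the log-odds (coefficient) scale, as OUTEST= output from PROC LOGISTIC normally is. The data set name ORPctls is hypothetical and not part of the code above. Because exp() is monotone increasing, the 2.5th and 97.5th percentiles of the coefficients map directly to the corresponding percentiles of the odds ratios.

```sas
/* Sketch only: convert bootstrap percentile limits of the coefficients
   into limits on the odds-ratio scale. ORPctls is a made-up name. */
data ORPctls;
   set Pctls;
   where _type_ =: 'P';               /* keep only the percentile rows */
   array coef {*} '23cyn0'n '28cat010'n s8a1yn0 dist_wild_10kmyn0;
   do i = 1 to dim(coef);
      coef{i} = exp(coef{i});         /* log-odds limit -> odds-ratio limit */
   end;
   drop i Intercept;                  /* exp(intercept) is not an odds ratio */
run;

proc print data=ORPctls noobs;
run;
```

With ref='1' and param=reference, each exponentiated value would be the odds ratio for level 0 versus level 1 of that variable.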