Hello,
Don't be Loopy - here's a full write-up on doing simulations in SAS:
http://www2.sas.com/proceedings/forum2007/183-2007.pdf
It's not clear what your question for us is. Can you be more specific about exactly what you need assistance with?
@mili wrote:
Hello,
I have 5 imputed datasets (imput&j, j = 1 to 5). For each one, I'd like to do the following 200 times (i = 1 to 200):
1) Resample with replacement the same number of observations as in the original imputed dataset.
2) In this bootstrap dataset, run a Cox proportional hazards regression with backward selection:
   proc phreg data = boot&i;
   model survival * censored (1) = R S T U V W X Y Z / selection = backward;
   ods output parameterestimates = estim&i ;
   run;
3) The dataset estim&i has a variable called "parameter", and each observation holds the name of one of the variables (R S T U V W X Y Z) that was retained by the backward selection in the previous step. I'd like to apply these selected variables to find the corresponding Harrell concordance c-statistic:
   proc phreg data = boot&i;
   model survival * censored (1) = values that the variable "parameter" takes in estim&i, here (R U W X Z);
   ods output concordance = cstatboot&i ;
   run;
4) Find the Harrell concordance c-statistic in the original imputed dataset using these same variables:
   proc phreg data = imput&j;
   model survival * censored (1) = values that the variable "parameter" takes in estim&i, here (R U W X Z);
   ods output concordance = cstatimput&i ;
   run;
5) The datasets cstatboot&i and cstatimput&i have the variables "estimate" and "stderr". I'd like to obtain the differences between those values:
   estimate from cstatboot&i - estimate from cstatimput&i
   stderr from cstatboot&i - stderr from cstatimput&i
6) Obtain the average of these differences (average difference in estimate +/- average difference in stderr) over the 200 resamplings coming from each imputed dataset.
Thank you for your help,
Much appreciated
Post your code using the code boxes. I can't tell if the asterisks are part of your code or something the forum added, because they don't make sense to me.
So your questions are the following?
1) how to feed the variables retained by the backward selection into
a proc phreg step (steps #3 and #4 from my initial post) to obtain the
Harrell concordance statistics.
2) how to output the results of these 2 Harrell statistics so that
the average difference between them can be computed.
Can you provide a worked example using one of the SASHELP data sets, maybe HEART or one from the docs so we can run something and help you out?
Otherwise for:
1. Use ODS OUTPUT with the ParameterEstimates table to get the variables included in the model. You can feed that to your next process by creating a macro variable list out of the variables.
2. Not sure without seeing output.
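For point 1, a minimal sketch of the macro variable list idea might look like the following. The dataset names work.estim and work.boot are hypothetical, the variable names survival and censored are taken from the original post, and the sketch assumes that the ParameterEstimates table holds only the final model, with one row per retained covariate and no CLASS variables.

```sas
/* Assumption: work.estim is the ParameterEstimates table saved by
   ODS OUTPUT from the backward-selection fit (hypothetical name). */
proc sql noprint;
   select distinct Parameter
      into :keepvars separated by ' '
      from work.estim;
quit;

/* Write the list to the log so it can be checked */
%put NOTE: retained covariates = &keepvars;

/* Refit with only the retained covariates and save the concordance table.
   work.boot is a hypothetical name for the bootstrap sample. */
proc phreg data=work.boot concordance=harrell(se);
   model survival*censored(1) = &keepvars;
   ods output Concordance=work.cstatboot;
run;
```

If backward selection removes every covariate, the SELECT returns no rows and &keepvars is not created, so a real macro would need to check for that case before the second PROC PHREG step.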
I'll move this to the statistical procedures forum where someone else may be able to help as well.
Yes, these are my 2 questions! Here is a very rough attempt at coding this using the Heart dataset as an example. The "??" mark my questions, where I do not know how to code adequately!
Thank you so much!!
/*
from SASHELP data sets: HEART; let's pretend:
   status = 1 -> alive
   status = 0 -> dead
   survival = AgeAtDeath - AgeAtStart

I'd like to do the following 200 times (i = 1 to 200):
1) resample with replacement the same number of observations as in the
   original imputed dataset
2) in this bootstrap dataset, run a Cox proportional hazards regression with
   backward selection
3) apply the variables that were selected by the backward procedure (the
   values of the variable "parameter" in "&outdata.predictor&i") to the
   bootstrap dataset, find the corresponding Harrell concordance c-statistic,
   and add its "estimate" and "stderr" to the dataset "performance" as the
   variables estboot and stdboot, on the row for bootstrap number boot
   (thus &i)
4) apply the same selected variables to the original dataset (&indataset.),
   find the corresponding Harrell concordance c-statistic, and add its
   "estimate" and "stderr" to the dataset "performance" as the variables
   estimpute and stdimpute, on the row for bootstrap number boot (thus &i)
5) in the "performance" dataset, compute "estimate original - estimate
   bootstrap" and "stderr original - stderr bootstrap" so that the average
   difference can be obtained for the estimate and the stderr
*/

%macro resample (indataset=, outdata=, reps=, size=);
%do i=1 %to &reps;

   /* 1) bootstrap sample of &size observations, drawn with replacement */
   proc surveyselect data = &indataset. out = &outdata.&i.
      noprint method = urs sampsize = &size outhits;
   run;

   /* 2) backward selection in the bootstrap sample */
   proc phreg data = &outdata.&i;
      model survival * status (1) = sex Systolic Smoking Cholesterol Weight / selection = backward;
      ods output parameterestimates = &outdata.predictor&i;
   run;

   /* 3) Harrell c-statistic of the selected model in the bootstrap sample */
   proc phreg data = &outdata.&i concordance=harrell (se);
      model survival * status (1) = /* ?? -> values that the variable parameter takes in &outdata.predictor&i */ ;
      /* ?? the concordance results:
            estimate is added to the column estboot, on the row for bootstrap # (&i) in the dataset performance
            stderr is added to the column stdboot, on the row for bootstrap # (&i) in the dataset performance */
   run;

   /* 4) Harrell c-statistic of the same model in the original dataset */
   proc phreg data = &indataset. concordance=harrell (se);
      model survival * status (1) = /* ?? -> values that the variable parameter takes in &outdata.predictor&i */ ;
      /* ?? the concordance results:
            estimate is added to the column estimpute, on the row for bootstrap # (&i) in the dataset performance
            stderr is added to the column stdimpute, on the row for bootstrap # (&i) in the dataset performance */
   run;

%end;
%mend;

%resample (indataset=sashelp.heart, outdata=work.bootstrap, reps=200, size=5209);

/* 5) differences between the original and bootstrap c-statistics */
data summary;
   set performance;
   diff_estim = estimpute - estboot;
   diff_stderr = stdimpute - stdboot;
run;
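One possible way to fill in the "??" bookkeeping for the performance dataset is sketched below. It is not from the thread. It assumes each concordance fit saved its table with ODS OUTPUT under the hypothetical names work.cstatboot (bootstrap sample) and work.cstatorig (original data), that each table has a single row, and that the columns are named Estimate and StdErr as described in the first post. The macro name collect is also hypothetical.

```sas
/* Sketch only: turn the two one-row Concordance tables of replicate &i
   into one row of work.performance. All dataset names are assumptions. */
%macro collect(i=);
   data work.perf_one;
      /* One-to-one merge of two single-row tables (no BY statement) */
      merge work.cstatboot(keep=Estimate StdErr
                           rename=(Estimate=estboot StdErr=stdboot))
            work.cstatorig(keep=Estimate StdErr
                           rename=(Estimate=estimpute StdErr=stdimpute));
      boot = &i;                          /* replicate number */
      diff_estim  = estimpute - estboot;  /* original minus bootstrap */
      diff_stderr = stdimpute - stdboot;
   run;

   /* Stack this replicate onto the running results table;
      PROC APPEND creates work.performance on the first call */
   proc append base=work.performance data=work.perf_one;
   run;
%mend collect;

/* After all replicates: average differences across the bootstrap samples */
proc means data=work.performance n mean;
   var diff_estim diff_stderr;
run;
```

The call %collect(i=&i) would go inside the %do loop, after the two concordance fits. Because PROC APPEND keeps adding rows, work.performance should be deleted before a rerun.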