# Running the same code multiple times on a random sample

2 weeks ago

I am using PROC SURVEYSELECT to select a random sample of 250 people from the dataset called baseline_followup. I then modify the new random sample to assign everyone a value of 0 for the Ycat variable. I combine the new dataset, which I named control2, with an existing dataset called case. I then compute the odds ratio for the variables xcat and ycat.

Is there a way to run this same code 100 times and find the average odds ratio? Because surveyselect randomly selects different people on each run, I want to calculate the 100 different odds ratios and find their average.

Any help is appreciated!

Below is the code:

proc surveyselect data=baseline_followup
                  out=control2
                  method=srs
                  sampsize=250;
run;

data control2;
  modify control2;
  Ycat=0;
run;

data casecontrol2;
  set case control2;
run;

proc freq data=casecontrol2 order=formatted;
  table xcat*ycat / relrisk;
run;


All Replies

2 weeks ago

Can you not generate, e.g., 1000 random samples and then use a BY statement (by samp) in the PROC FREQ?

2 weeks ago

Thank you @PaulBrownPhD for your reply. Yes, I can generate 100 random samples; surveyselect combines all of them into one dataset, so I can run a PROC FREQ with a BY statement to see the frequencies for each replicate. However, I then combine the 100 random samples with another dataset, and the replicate values are missing for the rows that come from that dataset, so I do not know how to calculate the odds ratio for each replicate. I would appreciate it if you could post sample code for any method, or any ideas.
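One possible way around the missing replicate values is to copy the case data once per replicate before stacking, so that every row carries a replicate number. The following is only a minimal sketch under stated assumptions: the dataset names control_reps, case_reps and casecontrol_reps and the seed value are made up for illustration, the case dataset is assumed not to already contain a variable named Replicate, and the ODS column names (StudyType, Value) are assumed to match the release used in this thread; other releases may label the RelativeRisks table differently.

```sas
/* Step 1: draw 100 simple random samples of 250 in one call.          */
/* REPS= adds a variable named Replicate (1-100) to the output.        */
proc surveyselect data=baseline_followup out=control_reps
                  method=srs sampsize=250 reps=100
                  seed=12345;          /* arbitrary seed, an assumption */
run;

/* Step 2: every sampled person becomes a control (Ycat=0) */
data control_reps;
  set control_reps;
  Ycat = 0;
run;

/* Step 3: copy the case data once per replicate so that Replicate     */
/* is never missing after the datasets are stacked                     */
data case_reps;
  set case;
  do Replicate = 1 to 100;
    output;
  end;
run;

/* Step 4: stack cases and controls, then sort for BY processing */
data casecontrol_reps;
  set case_reps control_reps;
run;

proc sort data=casecontrol_reps;
  by Replicate;
run;

/* Step 5: one 2x2 table per replicate; capture the estimates */
ods output RelativeRisks=allrelrisk;
proc freq data=casecontrol_reps order=formatted;
  by Replicate;
  tables xcat*ycat / relrisk;
run;

/* Step 6: average the 100 odds ratios (assumed column names) */
proc means data=allrelrisk mean;
  where StudyType like 'Case-Control%';
  var Value;
run;
```

Because everything runs in a single PROC FREQ call with BY processing, no macro loop or per-iteration datasets are needed.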

2 weeks ago

Use **OUTHITS** in the surveyselect statement to get separate copies of replicates. Or use a **weight numberhits;** statement in proc freq.
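As a rough illustration of the two options mentioned above: both matter only when a unit can be selected more than once, that is, when sampling with replacement (METHOD=URS), which is a different design from the METHOD=SRS used in the question. The dataset names control_urs and control_hits and the seed are assumptions made for this sketch.

```sas
/* Option A: keep one row per distinct selected unit.                  */
/* With METHOD=URS the output contains NumberHits, the number of       */
/* times each unit was drawn within a replicate.                       */
proc surveyselect data=baseline_followup out=control_urs
                  method=urs sampsize=250 reps=100
                  seed=12345;          /* arbitrary seed, an assumption */
run;

proc freq data=control_urs;
  by Replicate;
  weight NumberHits;                   /* count each unit once per hit */
  tables xcat*ycat;
run;

/* Option B: OUTHITS writes a separate row for every hit, so no        */
/* WEIGHT statement is needed downstream.                              */
proc surveyselect data=baseline_followup out=control_hits
                  method=urs sampsize=250 reps=100
                  seed=12345 outhits;
run;

proc freq data=control_hits;
  by Replicate;
  tables xcat*ycat;
run;
```

The two PROC FREQ steps should give the same counts, since they use the same seed and differ only in how repeated selections are stored.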

PG

Thursday


Thank you all for your responses and ideas! This is how I finally solved it.

--------------------------------------------------------------------------------------

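%MACRO domean;
  %DO I = 1 %TO 100;

    /* Draw a fresh simple random sample of 250 controls */
    proc surveyselect data=baseline_followup
                      out=controlmac&i
                      method=srs
                      sampsize=250;
    run;

    /* Everyone in the sample is a control: set Ycat to 0 */
    data controlmac&i;
      modify controlmac&i;
      Ycat=0;
    run;

    /* Stack the cases on top of this iteration's controls */
    data casecohort&i;
      set case controlmac&i;
    run;

    /* ODS TRACE writes the ODS table names (here RelativeRisks) to the log */
    ods trace on;
    proc freq data=casecohort&i order=formatted;
      table xcat*ycat / relrisk;
    run;
    ods trace off;

    /* Save the RelativeRisks table for this iteration */
    ods output RelativeRisks=Output&i;
    proc freq data=casecohort&i order=formatted;
      table xcat*ycat / relrisk;
    run;

    proc print data=Output&i noobs;
    run;

  %END;
%MEND domean;

%domean

/* Combine the 100 saved tables */
data allrelrisk;
  set output1-output100;
run;

/* Drop the cohort (relative risk) rows, keeping only the odds ratios */
data onlyodds;
  set allrelrisk;
  where studytype not like 'Co%)';
run;

/* Average over the 100 iterations */
proc means mean data=onlyodds;
run;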