Illidan7
Calcite | Level 5

Hey guys! I've been trying to use proc surveyselect to perform stratified random sampling and calculate the total and average of each sample taken. I'm just starting out with SAS and Enterprise Miner; I've only been at it for a week! The sampling is from a Claims data set. Here is the code I have come up with so far. For some reason it is not generating the 10000 sample iterations that I need. In fact, it is only generating one! I am not getting any error messages.

%let nSim=10000;
%let nTot=225;
%let n1=100;
%let n2=75;
%let n3=50;
%let Seed=12345;
%MACRO sample;
proc surveyselect data=&EM_IMPORT_DATA
out=Lib.sampler
sampsize=(&n1 &n2 &n3)
seed=&Seed
rep=1;
strata Strata;
%MEND sample;
data Lib.Claims;
keep sim Claimstot Claimsavg;
array Claimsamt[&nTot];
do sim= 1 to &nSim;
Claimstot=0;
&sample;
do i= 1 to &nTot;
set Lib.sampler;
Claimsamt[i] = Claim_Dollars;
end;
do i= 1 to &nTot;
Claimstot=Claimstot + Claimsamt[i];
end;
Claimsavg = Claimstot/&nTot;
output;
end;

proc print data=Lib.Claims;
run;

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

It is more efficient to use replicate sampling than to do macro looping. In your loops, if you give the same seed to surveyselect for each iteration, you will get exactly the same sample every time.

Here is an example of replicated stratified sampling using BY processing:

 

/* Prepare sorted example data, keep only required variables */
proc sort data=sashelp.heart out=myData(keep=chol_status weight);
    where chol_status is not missing;
    by chol_status;
run;

/* Generate replicate stratified samples (chol_status has three values) */
proc surveyselect data=myData out=mySamples
        sampsize=(100 75 50)
        seed=12345
        rep=1000;
    strata chol_status;
run;

/* Calculate statistics for each replicate sample */
proc sort data=mySamples;
    by replicate chol_status;
run;

proc summary data=mySamples;
    by replicate;
    var weight;
    output out=myStats mean= sum= / autoname;
run;

/* Look at the distribution of the statistics */
proc sgplot data=myStats;
    histogram weight_mean;
    density weight_mean;
run;
PG
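
For readers who want to map this onto the original question, here is a minimal sketch of the same replicate approach using the names from the asker's code (Strata, Claim_Dollars, Claimstot, Claimsavg). It is an illustration, not tested against the real data: WORK.CLAIMS is a hypothetical stand-in for the table behind &EM_IMPORT_DATA, and it assumes Strata has exactly three levels, each with at least as many rows as its sample size.

```sas
/* Sketch only: WORK.CLAIMS is a hypothetical stand-in for the asker's claims table */
%let nSim = 10000;

/* The STRATA statement expects the input sorted by the strata variable */
proc sort data=work.claims out=work.claims_sorted(keep=Strata Claim_Dollars);
    by Strata;
run;

/* One call draws all replicates; each gets its own value of Replicate */
proc surveyselect data=work.claims_sorted out=work.sampler
        method=srs
        sampsize=(100 75 50)   /* one size per stratum, in sorted order */
        seed=12345
        reps=&nSim
        noprint;
    strata Strata;
run;

/* Total and average claim amount per replicate (225 rows each) */
proc summary data=work.sampler;
    by Replicate;
    var Claim_Dollars;
    output out=work.claims_stats(drop=_type_ _freq_)
        sum=Claimstot mean=Claimsavg;
run;
```

Under these assumptions, WORK.CLAIMS_STATS should end up with one row per replicate, which plays the role of the Lib.Claims data set the original DATA step was trying to build.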


2 REPLIES
Reeza
Super User

Your output dataset doesn't have a unique name for each iteration, so whatever you do will likely get overwritten.

 

I highly recommend looking into David Cassell's paper "Don't Be Loopy: Re-Sampling and Simulation the SAS Way" on simulations in SAS.

It also looks like you're trying to execute the macro using &sample instead of %sample, so I'm not sure your code is doing what you expect.
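
To make the two points above concrete, here is a rough sketch of what a macro loop would need: a macro call written with %, a distinct output name per iteration, and a seed that changes each time. The data set names (WORK.CLAIMS_SORTED, WORK.SAMPLE_1, ...) and the parameters nSim and baseSeed are made up for illustration. Note also that a PROC step cannot run inside a DATA step DO loop, because the PROC statement ends the DATA step, so the loop has to live in macro code.

```sas
/* Hypothetical names: WORK.CLAIMS_SORTED is assumed to be sorted by Strata */
%macro sample(nSim=3, baseSeed=12345);
    %local i;
    %do i = 1 %to &nSim;
        proc surveyselect data=work.claims_sorted
                out=work.sample_&i           /* unique data set per iteration */
                sampsize=(100 75 50)
                seed=%eval(&baseSeed + &i)   /* a different seed each time */
                noprint;
            strata Strata;
        run;
    %end;
%mend sample;

/* A macro is invoked with %; &sample would refer to a macro variable instead */
%sample(nSim=3)
```

This is shown only to explain why the original code misbehaves; for thousands of iterations, a single PROC SURVEYSELECT call with replicates avoids creating one data set per iteration.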

 

PS. Please post code using the Code {i} or the running man button in the editor and format it for readability. This also helps with debugging. 

