Hi Community:
Thanks a lot for your help. I have already made big progress on my project, and I just need one last step to finish it.
My goal is to use the PPS method in PROC SURVEYSELECT to select 125 observations from a data set of 1000. I then take the remaining 875 observations and change one of their variables, and merge the two parts together to generate a new data set.
I want to repeat this whole process 100 times. I was told that I could use a loop, but I don't know how to apply one to a process like this.
Here is the code for the first prediction:
proc surveyselect data=ORD
method=pps sampsize=125 out = randomsurveyPPS;
size b ;
run;
data randomsurveyPPS;
set randomsurveyPPS;
keep i a b;
run;
proc print data = randomsurveyPPS;
run;
proc MEANS data = randomsurveyPPS mean std;
proc print data = randomsurveyPPS ; var x ; run ;
data randomsurveyPPS ; set randomsurveyPPS ;
file takeout ;
put i b a ;
run ;
* Generate the 875 dummy data;
proc sql;
title 'SQL Table Prepart';
create table prepart as
SELECT * FROM ORD
except
SELECT * FROM randomsurveyPPS;
quit;
proc print data = prepart(obs=100);
var i a b;
run;
data prediction;
set prepart;
by i;
if a then a = 9999;
run;
proc print data = prediction;
run;
proc sql;
title 'SQL Table COMBINED';
create table combined as
select * from randomsurveyPPS
outer union corr
select * from prediction;
quit;
proc print data = combined(obs=1000);
var i a b;
run;
Also, I got a hint that I could generate the 125-observation random samples with the REPS option. I am not sure whether that can be used inside a loop.
Here are the original data set and my first prediction code. Thanks in advance!
PROC SURVEYSELECT has a REPS= option that replicates the sample, in your case 100 times.
Use that option to generate all 100 samples at once, and then add BY-group processing (BY Replicate) to the remainder of your steps.
It's an incredibly useful feature that means you don't have to loop or get into macros at all. It's usually faster and more dynamic as well.
I'm not going to try to show it with your code because, to be honest, I don't understand it at all.
proc MEANS data = randomsurveyPPS mean std;
proc print data = randomsurveyPPS ; var x ; run ;
data randomsurveyPPS ; set randomsurveyPPS ;
file takeout ;
put i b a ;
run ;
This, for example, doesn't make any sense to me as part of the process. I'd also strongly recommend against using the same name in your SET and DATA statements. If something goes wrong, you overwrite your original data set, and errors are harder to catch with this style of programming. I see you calculate the means but then don't do anything with them, so I assume it's just for display purposes. If that's the case, I would comment that out or exclude it from 'production code', as all it does is waste time. No one is looking at the output of 100 reps in HTML and concluding anything useful from it. If you need it, you should save the output in a better format.
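For anyone who wants a starting point, here is a minimal sketch of the REPS= plus BY-group idea. It is not a rewrite of the posted program: it assumes that ORD contains the variables i, a and b, that i uniquely identifies each row, and that the SEED= value and the data set names pps_samples, remainder, combined and rep_stats are free to choose (they are made up for this example).

```sas
/* Draw 100 PPS samples of 125 rows each in a single call.            */
/* The OUT= data set gets a Replicate variable numbered 1 to 100.     */
/* SEED= is an arbitrary value, used only to make the run repeatable. */
proc surveyselect data=ORD method=pps sampsize=125 reps=100
                  seed=12345 out=pps_samples(keep=Replicate i a b);
   size b;
run;

/* For every replicate, list the rows of ORD that were NOT sampled   */
/* (875 per replicate). Assumes i uniquely identifies a row in ORD.   */
proc sql;
   create table remainder as
   select r.Replicate, o.i, o.a, o.b
   from (select distinct Replicate from pps_samples) as r,
        ORD as o
   where not exists
         (select 1
          from pps_samples as s
          where s.Replicate = r.Replicate and s.i = o.i)
   order by r.Replicate, o.i;
quit;

/* Stack sampled and non-sampled rows within each replicate and apply */
/* the same change as the original post to the non-sampled rows only. */
/* New output name, so no input data set is overwritten.              */
data combined;
   set pps_samples(in=sampled) remainder;
   by Replicate;
   if not sampled and a then a = 9999;
run;

/* Save per-replicate statistics to a data set instead of printing.   */
proc means data=pps_samples noprint;
   by Replicate;
   var a b;
   output out=rep_stats mean= std= / autoname;
run;
```

The combined data set then holds 1000 rows per replicate, and any later step can process all 100 versions at once by adding BY Replicate.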
And if you really do want to turn this into a macro, for whatever reason, here are some good references on how to do so.
UCLA introductory tutorial on macro variables and macros
https://stats.idre.ucla.edu/sas/seminars/sas-macros-introduction/
Tutorial on converting a working program to a macro
This method is pretty robust, helps prevent errors, and makes it much easier to debug your code. Obviously I'm biased, because I wrote it 🙂
https://github.com/statgeek/SAS-Tutorials/blob/master/Turning%20a%20program%20into%20a%20macro.md
Examples of common macro usage
https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Ap...
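As a rough illustration of what the macro route could look like, the sketch below wraps one sample-and-combine pass in a %DO loop and appends each pass to a single data set. It is only a sketch under assumptions: ORD is assumed to hold i, a and b with i unique per row, and the macro name pps_loop, the work data sets _sample, _rest, _one_rep and all_reps, and the seed formula are all invented for this example.

```sas
%macro pps_loop(nreps=100);
   %local r;

   /* Start clean so repeated runs do not keep appending to old results. */
   proc datasets lib=work nolist nowarn;
      delete all_reps;
   quit;

   %do r = 1 %to &nreps;
      /* One PPS sample of 125 rows; a different (arbitrary) seed per pass. */
      proc surveyselect data=ORD method=pps sampsize=125
                        seed=%eval(1000 + &r) out=_sample(keep=i a b) noprint;
         size b;
      run;

      /* The 875 rows that were not sampled in this pass. */
      proc sql;
         create table _rest as
         select i, a, b
         from ORD
         where i not in (select i from _sample);
      quit;

      /* Tag the pass number and change a on the non-sampled rows only. */
      data _one_rep;
         Replicate = &r;
         set _sample(in=sampled) _rest;
         if not sampled and a then a = 9999;
      run;

      /* Accumulate every pass in one data set. */
      proc append base=all_reps data=_one_rep;
      run;
   %end;
%mend pps_loop;

%pps_loop(nreps=100)
```

This runs PROC SURVEYSELECT 100 times instead of once, which is why the REPS= approach is usually the faster option.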
Hi @Gustavo8, you asked this exact question before and marked an answer as correct (plus there were other good answers).