I have this piece of code. However, the Macro in proc univariate generate too many separate dataset due to loop t from 1 to 310. How can I modify this code to include all proc univariate output into one dataset and then modify the rest of the code for a more efficient run?
%let L=10; %* 10th percentile *;
%let H=%eval(100 - &L); %* 90th percentile*;
%let wlo=V1&L V2&L V3&L ;
%let whi=V1&H V2&H V3&H ;
%let wval=wV1 wV2 wV3 ;
%let val=V1 V2 V3;
%macro winsorise();
%do v=1 %to %sysfunc(countw(&val));
%do t=1 %to 310;
proc univariate data=regressors noprint;
var &val;
output out=_winsor&t._V&v pctlpts=&H &L
prtlpre=&val&t._V&v;
where time_count<=&t;run;
%end;
data regressors (drop=__:);
set regressors;
if _n_=1 then set _winsor&t._V&v;
&wval&t._V&v=min(max(&val&t._V&v,&wlo&t._V&v),&whi&t._V&v);
run;
%end;
%mend;
Thank you.
The code I posted already did that. Try it.
data have;
do time_count=1 to 5;
output;
end;
run;
data want;
set have;
do t=time_count to 5;
output;
end;
run;
proc sort;
by t time_count;
run;
proc freq;
tables time_count;
run;
proc print;
run;
Cumulative Cumulative time_count Frequency Percent Frequency Percent 1 5 33.33 5 33.33 2 4 26.67 9 60.00 3 3 20.00 12 80.00 4 2 13.33 14 93.33 5 1 6.67 15 100.00 time_ Obs count t 1 1 1 2 1 2 3 2 2 4 1 3 5 2 3 6 3 3 7 1 4 8 2 4 9 3 4 10 4 4 11 1 5 12 2 5 13 3 5 14 4 5 15 5 5
Why do you need a macro at all?
Why don't you just put all the variables in one VAR statement in one PROC UNIVARIATE? This produces one output data set.
Before you start worrying about writing a macro to generate code you need to figure out what actual code you want to generate. Sounds like you are trying to replicate values into multiple separate analyses. So just use multiple OUTPUT statements.
data groups;
set regressors;
do t=time_count to 310 ;
output;
en;
run;
proc univariate data=regressors noprint;
class t;
var v1;
output out=winsor_v1 pctlpts=90 10 prtlpre=v1_;
run;
Hi,
Thanks for your reply. I tried it and it is not what I want. However, this might work if I can make a new dataset that contain 310 replicated observation for t=1, 309 replicated observation for t=2; 308 replicated observation for t=3,etc.
something like this. Can you suggest how to modify your code to achieve this?
Obs t group 1 1 1 2 1 2 3 2 2 4 1 3 5 2 3 6 3 3
The code I posted already did that. Try it.
data have;
do time_count=1 to 5;
output;
end;
run;
data want;
set have;
do t=time_count to 5;
output;
end;
run;
proc sort;
by t time_count;
run;
proc freq;
tables time_count;
run;
proc print;
run;
Cumulative Cumulative time_count Frequency Percent Frequency Percent 1 5 33.33 5 33.33 2 4 26.67 9 60.00 3 3 20.00 12 80.00 4 2 13.33 14 93.33 5 1 6.67 15 100.00 time_ Obs count t 1 1 1 2 1 2 3 2 2 4 1 3 5 2 3 6 3 3 7 1 4 8 2 4 9 3 4 10 4 4 11 1 5 12 2 5 13 3 5 14 4 5 15 5 5
Thank you for your reply. However, I realise that my dataset is panel data, for each time_count, I have thousands of cross-sectional observations. So when I stack the time_count in the same dataset, SAS cannot process the explosive number of observations.
I will keep in mind your approach before I do any complicated macro.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.