DATA Step, Macro, Functions and more

How do I write a macro with array variables to run thru a proc ?

Reply
New Contributor
Posts: 2

How do I write a macro with array variables to run thru a proc ?

I have a data set that has 50 array variables (x_1, x_2, x_3...) and each variable has around 7000 observations and I need to push each variable thru a macro that will take them thru proc univariate to create a trimmed means for each variable by a group variable (shown as AGE below) and then produce one table that has all of the amended variables in it.  Any help with how this macro would be set up would be greatly appreciated.

 

 

 

proc univariate data=DATASET  trimmed=1 .1;

   by AGE
   var X_1 X_2 ...  ;
run;

 

 

Super User
Posts: 11,343

Re: How do I write a macro with array variables to run thru a proc ?

If your variables are truly named X_1 X_2 and such then you don't need any macro.


proc univariate data=dataset trimmed=1.1;
   by age;
   var x_1 x_2 x_3 ;
output out=means mean=x_1 x_2 x_3 ;
run;

Just put your list of variables on the Var line and then after the mean.

 

Trusted Advisor
Posts: 1,117

Re: How do I write a macro with array variables to run thru a proc ?

Hello @eabbo,

 

I agree with @ballardw that you most likely won't need a macro. However, I'm not quite sure what you mean by "amended variables."

 

Given the large number of variables (times the number of BY groups), you will probably want to create a dataset containing the trimmed means.


Please note that with your specification trimmed=1 .1 you request two different types of k-times trimmed means: k=1 and k=ceil(n_i 0.1) where n_i is the size of the i-th BY group. I think you need to use ODS OUTPUT to get these into a dataset. The MEAN= option of the OUTPUT statement requests the ordinary (untrimmed) means, regardless of the TRIMMED= option in the PROC UNIVARIATE statement.

 

/* Create test data */

data have;
array x_[50];
do i=1 to 7000;
  age=51+int(i/2500);
  do j=1 to dim(x_);
    x_[j]=100+10*rannor(2718);
  end;
  output;
end;
drop i j;
run;

/* Write two types of trimmed means to dataset TMEANS */

ods listing close;
ods output TrimmedMeans=tmeans;
proc univariate data=have trimmed=1 .1;
by age;
var x:;
run;
ods listing;

 

Or do you need the trimmed data, i.e., a dataset where the smallest and largest k values per BY group are deleted or set to missing?

New Contributor
Posts: 2

Re: How do I write a macro with array variables to run thru a proc ?

Posted in reply to FreelanceReinhard

Correct, I do need the trimmed data per BY group in a data set with the outliers either deleted or set to missing.

 

 

 

Within the output for proc univariate or proc mean can you specify an output argument similar to the below that would work? Where I want the descriptive statistics for all the variables within my array (x_1, x_2, x_3.... x_56) but I want a seperate table for each variable.

 

proc mean data= ;

by age;

var x_1 x_2 x_3... x_56;

output 

     IF var = x_1 THEN output=X_1 mean=average n=n lclm =lower uclm=upper ;

   IF var = x_2 THEN output=X_2 mean=average n=n lclm =lower uclm=upper ;

   IF var = x_3 THEN output=X_3 mean=average n=n lclm =lower uclm=upper ;

....

   IF var = x_56 THEN output=X_56 mean=average n=n lclm =lower uclm=upper ;

run;

Trusted Advisor
Posts: 1,117

Re: How do I write a macro with array variables to run thru a proc ?

Unfortunately, as far as I know, PROC UNIVARIATE does not provide an option to write trimmed (or Winsorized) data to a dataset, and PROC MEANS (and PROC SUMMARY) do not even compute trimmed means.

 

So, you'll have to trim your data "manually," e.g., in a data step or with SAS/IML (if available). Earlier this month, there was a similar discussion about Winsorizing variables with valuable contributions by @Rick_SAS and others. I think it should be possible to adapt code for Winsorizing data in such a way that the outliers are removed or set to missing rather than replaced by less extreme values.

 

As a first guess, I could imagine that it'll be helpful to transpose your data (at least temporarily) so that the 56 variables are turned into BY groups of a single variable. 56*7000=392000 observations are still easy to handle with SAS. Then you could trim the values within the "combined" BY groups (i.e. the new ones due to transposing within the old ones "BY age") and eventually re-transpose.

 

Just give it a try. If you experience any difficulties, please feel free to ask again. Good luck!

SAS Super FREQ
Posts: 3,752

Re: How do I write a macro with array variables to run thru a proc ?

Why do you think you need to winsorize each variable? If your answer is "because there might be outliers," I would note that there are statistical techniques that are robust to outliers, such as robust regression.

Ask a Question
Discussion stats
  • 5 replies
  • 340 views
  • 1 like
  • 4 in conversation