Hi all,
I need to categorize many variables (var1-var9) based on the mean and SD. I would like to ask for syntax (a macro?) to do this faster and more accurately.
///
var1_3cat=.;
if var1 < (mean-sd) then var1_3cat=1;
if var1 >= (mean-sd) & var1 <= (mean+sd) then var1_3cat=2;
if var1 >(mean+sd) then var1_3cat=3;
///
Thank you!
haoduonge
How are the mean and sd calculated? Are they unique to each variable or from a population value or from all of the variables?
You may not need a macro. Look at PROC STDIZE or PROC STANDARD to standardize the variables, which is essentially what you're trying to do here.
The mean and SD are calculated with PROC MEANS and are unique to each variable.
Thanks
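For readers following along, here is a minimal sketch of how the original three-way rule could be applied to all nine variables without a macro, using PROC MEANS output and DATA step arrays. The dataset names (have, stats, want) and the helper variables m1-m9 and s1-s9 are assumptions made for illustration; only var1-var9 and the _3cat naming come from the question.

```sas
/* Sketch only: have, stats, want, m1-m9 and s1-s9 are assumed names. */
proc means data=have noprint;
  var var1-var9;
  output out=stats mean=m1-m9 std=s1-s9;   /* one mean and SD per variable */
run;

data want;
  /* Read the single row of statistics once; SET retains its values. */
  if _n_ = 1 then set stats(keep=m1-m9 s1-s9);
  set have;
  array v{9} var1-var9;
  array m{9} m1-m9;
  array s{9} s1-s9;
  array c{9} var1_3cat var2_3cat var3_3cat var4_3cat var5_3cat
             var6_3cat var7_3cat var8_3cat var9_3cat;
  do i = 1 to 9;
    if missing(v{i}) then c{i} = .;                  /* keep missing as missing */
    else if v{i} < m{i} - s{i} then c{i} = 1;        /* below mean - SD */
    else if v{i} <= m{i} + s{i} then c{i} = 2;       /* within one SD of the mean */
    else c{i} = 3;                                   /* above mean + SD */
  end;
  drop i m1-m9 s1-s9;
run;
```

The explicit missing check matters here: in SAS a missing value compares as lower than any number, so without it a missing var1 would fall into category 1.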
I would simply create an informat and calculate zscores and apply the informat to that calculation. e.g.:
proc format;
  invalue zcat
    . = .
    low -< -1 = 1
    1.0000000001 - high = 3
    other = 2
  ;
run;

data have;
  input score;
  mean=6;
  std=2;
  var1_3cat=input((score-mean)/std, zcat.);
  cards;
2
3
4
.
5
6
7
8
9
10
;
Art, CEO, AnalystFinder.com
@haoduonge wrote:
Mean and SD are calculated from proc means and unique to each variable.
Thanks
Proc stdize is a good option. Look at the METHOD options on the PROC statement.
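As a rough sketch of that suggestion, assuming an input dataset named have: copy var1-var9 into z1-z9 so the original values are kept, let PROC STDIZE turn the copies into z-scores with METHOD=STD, then cut each z-score at -1 and +1. The names copy, zscores, want and z1-z9 are made up for this example.

```sas
/* Sketch only: copy, zscores, want and z1-z9 are assumed names. */
data copy;
  set have;
  array v{9} var1-var9;
  array z{9} z1-z9;
  do i = 1 to 9;
    z{i} = v{i};            /* work on copies so var1-var9 stay unchanged */
  end;
  drop i;
run;

proc stdize data=copy out=zscores method=std;   /* (x - mean) / std */
  var z1-z9;
run;

data want;
  set zscores;
  array z{9} z1-z9;
  array c{9} var1_3cat var2_3cat var3_3cat var4_3cat var5_3cat
             var6_3cat var7_3cat var8_3cat var9_3cat;
  do i = 1 to 9;
    if missing(z{i}) then c{i} = .;
    else if z{i} < -1 then c{i} = 1;
    else if z{i} <= 1 then c{i} = 2;
    else c{i} = 3;
  end;
  drop i;
run;
```

Values that sit exactly on a cutpoint may be classified differently than with a direct comparison against mean-sd and mean+sd, because the z-score is a computed floating-point value.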
This is very easy to do in IML code.

proc iml;
use sashelp.class;
read all var _num_ into x[c=vname];
close;
mean=mean(x);
std=std(x);
want=j(nrow(x),ncol(x),.);
do i=1 to ncol(x);
  cutpoint=min(x[,i])||(mean[i]-std[i])||(mean[i]+std[i])||max(x[,i]);
  want[,i]=bin(x[,i],cutpoint);
end;
print want[c=vname];
quit;