Hello,
In want to calculate statistics for all numeric variables.
There are some statistics that I don't know how to calculate for each numeric variable -
Number of rows with zero value (nr_Zero)
Number of rows with missing value (nr_Null)
Number of rows with positive value (nr_POS)
Number of rows with negative value that is not null (nr_Zero)
What is the way to calculate it automatically for each numeric var?
proc means data=sashelp.class sum min max q1 median q3 P1 p5 p95 P99 P75 P25 cv SUM Range N mean std STACKODSOUTPUT maxdec=2;
var _numeric_;
ods output summary=Want1(Rename=(Variable=Var_Name));
run;
ods output nlevels=LEVELS;
proc freq data=sashelp.class nlevels;
tables _numeric_ / noprint ;
run;
Nr_Null
Nr_Zero
Nr_POS
Nr_Neg
You could use the SIGN() function to convert your values into the four categories.
Or you could define a format.
proc format;
value sign low-<0 = 'Negative'
0 = 'Zero'
0<-high = 'Positive'
other = 'Missing'
;
run;
Once you have the format you could use it with PROC FREQ for example:
proc freq data=have;
tables _numeric_ / missing list ;
format _numeric_ sign.;
run;
Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.
proc summary data=have chartype completetypes;
ways 1;
class _numeric_ / missing preloadfmt;
format _numeric_ sign.;
output out=counts;
run;
data summary ;
length name $32 ;
set counts ;
array vars _numeric_;
index=indexc(_type_,'1');
name=vname(vars[index]);
sign=vvalue(vars[index]);
count=_freq_;
keep name sign count;
run;
proc transpose data=summary out=want(drop=_name_) ;
by name notsorted;
id sign;
var count;
run;
Let's make some sample data:
data have;
array vars var1-var4 ;
do row=1 to 10;
do col=1 to dim(vars);
select(rand('integer',4));
when(1) vars[col]=.;
when(2) vars[col]=-1;
when(3) vars[col]=0;
when(4) vars[col]=1;
otherwise;
end;
end;
output;
end;
drop row col;
run;
So those two outputs would look something like this:
Seems like this is answered in your other thread.
@PaigeMiller wrote:
Seems like this is answered in your other thread.
Indeed ... and that other thread is
Solved: Wide to Long - SAS Support Communities
Still I have re-written the program without first transposing from wide to long.
The program is to be considered un-tested as SASHELP.CLASS is a bad choice for developing and testing the approach (no missing values, no zeros, ...).
Other remark : program goes through the full data (again) with every macro call. If your data is big data, you might want to re-write such that program goes through the data only once.
PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
delete want_summary_Report_: ; run;
delete distinct_levels_: ; run;
delete profil ; run;
run; QUIT;
%MACRO profil(checknm=nulls , checkop=EQ .);
data work.want_summary_Report_&checknm.(keep=nr_&checknm._age nr_&checknm._height nr_&checknm._weight);
set sashelp.class end=last;
*array in{*} _NUMERIC_;
array in{3} age height weight;
array out{3} _TEMPORARY_ ;
/* Temporary data element values are always automatically retained, */
/* rather than being reset to missing at the beginning */
/* of the next iteration of the DATA step. */
do i=1 to dim(in);
if _N_ = 1 then do; out(i)=0; end;
if i=1 and in(i) &checkop. then out(i) = out(i) + 1;
else if i=2 and in(i) &checkop. then out(i) = out(i) + 1;
else if i=3 and in(i) &checkop. then out(i) = out(i) + 1;
else;
end;
if last then do;
nr_&checknm._age = out(1);
nr_&checknm._height = out(2);
nr_&checknm._weight = out(3);
output;
end;
run;
PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
modify want_summary_Report_&checknm.;
rename nr_&checknm._age = age;
rename nr_&checknm._height = height;
rename nr_&checknm._weight = weight;
run; QUIT;
%MEND profil;
%profil(checknm=nulls , checkop=EQ .);
%profil(checknm=zeros , checkop=EQ 0);
%profil(checknm=negw0 , checkop=LT 0);
%profil(checknm=posw0 , checkop=GT 0);
proc transpose data=want_summary_Report_nulls
out=want_summary_Report_nulls_tp(rename=(COL1=nr_nulls)); run;
proc transpose data=want_summary_Report_zeros
out=want_summary_Report_zeros_tp(rename=(COL1=nr_zeros)); run;
proc transpose data=want_summary_Report_negw0
out=want_summary_Report_negw0_tp(rename=(COL1=nr_negw0)); run;
proc transpose data=want_summary_Report_posw0
out=want_summary_Report_posw0_tp(rename=(COL1=nr_posw0)); run;
data profil;
merge work.want_summary_Report_nulls_tp
work.want_summary_Report_zeros_tp
work.want_summary_Report_negw0_tp
work.want_summary_Report_posw0_tp;
by _NAME_;
label _NAME_="Variable profiled";
run;
proc freq data=sashelp.class nlevels noprint;
tables age / out=work.distinct_levels_age;
tables height / out=work.distinct_levels_height;
tables weight / out=work.distinct_levels_weight;
run;
/* end of program */
Koen
You could use the SIGN() function to convert your values into the four categories.
Or you could define a format.
proc format;
value sign low-<0 = 'Negative'
0 = 'Zero'
0<-high = 'Positive'
other = 'Missing'
;
run;
Once you have the format you could use it with PROC FREQ for example:
proc freq data=have;
tables _numeric_ / missing list ;
format _numeric_ sign.;
run;
Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.
proc summary data=have chartype completetypes;
ways 1;
class _numeric_ / missing preloadfmt;
format _numeric_ sign.;
output out=counts;
run;
data summary ;
length name $32 ;
set counts ;
array vars _numeric_;
index=indexc(_type_,'1');
name=vname(vars[index]);
sign=vvalue(vars[index]);
count=_freq_;
keep name sign count;
run;
proc transpose data=summary out=want(drop=_name_) ;
by name notsorted;
id sign;
var count;
run;
Let's make some sample data:
data have;
array vars var1-var4 ;
do row=1 to 10;
do col=1 to dim(vars);
select(rand('integer',4));
when(1) vars[col]=.;
when(2) vars[col]=-1;
when(3) vars[col]=0;
when(4) vars[col]=1;
otherwise;
end;
end;
output;
end;
drop row col;
run;
So those two outputs would look something like this:
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.