Hello,
In want to calculate statistics for all numeric variables.
There are some statistics that I don't know how to calculate for each numeric variable -
Number of rows with zero value (nr_Zero)
Number of rows with missing value (nr_Null)
Number of rows with positive value (nr_POS)
Number of rows with negative value that is not null (nr_Zero)
What is the way to calculate it automatically for each numeric var?
proc means data=sashelp.class sum min max q1 median q3 P1 p5 p95 P99 P75 P25 cv SUM Range N mean std STACKODSOUTPUT maxdec=2;
var _numeric_;
ods output summary=Want1(Rename=(Variable=Var_Name));
run;
ods output nlevels=LEVELS;
proc freq data=sashelp.class nlevels;
tables _numeric_ / noprint ;
run;
Nr_Null
Nr_Zero
Nr_POS
Nr_Neg
You could use the SIGN() function to convert your values into the four categories.
Or you could define a format.
proc format;
value sign low-<0 = 'Negative'
0 = 'Zero'
0<-high = 'Positive'
other = 'Missing'
;
run;
Once you have the format you could use it with PROC FREQ for example:
proc freq data=have;
tables _numeric_ / missing list ;
format _numeric_ sign.;
run;
Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.
proc summary data=have chartype completetypes;
ways 1;
class _numeric_ / missing preloadfmt;
format _numeric_ sign.;
output out=counts;
run;
data summary ;
length name $32 ;
set counts ;
array vars _numeric_;
index=indexc(_type_,'1');
name=vname(vars[index]);
sign=vvalue(vars[index]);
count=_freq_;
keep name sign count;
run;
proc transpose data=summary out=want(drop=_name_) ;
by name notsorted;
id sign;
var count;
run;
Let's make some sample data:
data have;
array vars var1-var4 ;
do row=1 to 10;
do col=1 to dim(vars);
select(rand('integer',4));
when(1) vars[col]=.;
when(2) vars[col]=-1;
when(3) vars[col]=0;
when(4) vars[col]=1;
otherwise;
end;
end;
output;
end;
drop row col;
run;
So those two outputs would look something like this:
Seems like this is answered in your other thread.
@PaigeMiller wrote:
Seems like this is answered in your other thread.
Indeed ... and that other thread is
Solved: Wide to Long - SAS Support Communities
Still I have re-written the program without first transposing from wide to long.
The program is to be considered un-tested as SASHELP.CLASS is a bad choice for developing and testing the approach (no missing values, no zeros, ...).
Other remark : program goes through the full data (again) with every macro call. If your data is big data, you might want to re-write such that program goes through the data only once.
PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
delete want_summary_Report_: ; run;
delete distinct_levels_: ; run;
delete profil ; run;
run; QUIT;
%MACRO profil(checknm=nulls , checkop=EQ .);
data work.want_summary_Report_&checknm.(keep=nr_&checknm._age nr_&checknm._height nr_&checknm._weight);
set sashelp.class end=last;
*array in{*} _NUMERIC_;
array in{3} age height weight;
array out{3} _TEMPORARY_ ;
/* Temporary data element values are always automatically retained, */
/* rather than being reset to missing at the beginning */
/* of the next iteration of the DATA step. */
do i=1 to dim(in);
if _N_ = 1 then do; out(i)=0; end;
if i=1 and in(i) &checkop. then out(i) = out(i) + 1;
else if i=2 and in(i) &checkop. then out(i) = out(i) + 1;
else if i=3 and in(i) &checkop. then out(i) = out(i) + 1;
else;
end;
if last then do;
nr_&checknm._age = out(1);
nr_&checknm._height = out(2);
nr_&checknm._weight = out(3);
output;
end;
run;
PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
modify want_summary_Report_&checknm.;
rename nr_&checknm._age = age;
rename nr_&checknm._height = height;
rename nr_&checknm._weight = weight;
run; QUIT;
%MEND profil;
%profil(checknm=nulls , checkop=EQ .);
%profil(checknm=zeros , checkop=EQ 0);
%profil(checknm=negw0 , checkop=LT 0);
%profil(checknm=posw0 , checkop=GT 0);
proc transpose data=want_summary_Report_nulls
out=want_summary_Report_nulls_tp(rename=(COL1=nr_nulls)); run;
proc transpose data=want_summary_Report_zeros
out=want_summary_Report_zeros_tp(rename=(COL1=nr_zeros)); run;
proc transpose data=want_summary_Report_negw0
out=want_summary_Report_negw0_tp(rename=(COL1=nr_negw0)); run;
proc transpose data=want_summary_Report_posw0
out=want_summary_Report_posw0_tp(rename=(COL1=nr_posw0)); run;
data profil;
merge work.want_summary_Report_nulls_tp
work.want_summary_Report_zeros_tp
work.want_summary_Report_negw0_tp
work.want_summary_Report_posw0_tp;
by _NAME_;
label _NAME_="Variable profiled";
run;
proc freq data=sashelp.class nlevels noprint;
tables age / out=work.distinct_levels_age;
tables height / out=work.distinct_levels_height;
tables weight / out=work.distinct_levels_weight;
run;
/* end of program */
Koen
You could use the SIGN() function to convert your values into the four categories.
Or you could define a format.
proc format;
value sign low-<0 = 'Negative'
0 = 'Zero'
0<-high = 'Positive'
other = 'Missing'
;
run;
Once you have the format you could use it with PROC FREQ for example:
proc freq data=have;
tables _numeric_ / missing list ;
format _numeric_ sign.;
run;
Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.
proc summary data=have chartype completetypes;
ways 1;
class _numeric_ / missing preloadfmt;
format _numeric_ sign.;
output out=counts;
run;
data summary ;
length name $32 ;
set counts ;
array vars _numeric_;
index=indexc(_type_,'1');
name=vname(vars[index]);
sign=vvalue(vars[index]);
count=_freq_;
keep name sign count;
run;
proc transpose data=summary out=want(drop=_name_) ;
by name notsorted;
id sign;
var count;
run;
Let's make some sample data:
data have;
array vars var1-var4 ;
do row=1 to 10;
do col=1 to dim(vars);
select(rand('integer',4));
when(1) vars[col]=.;
when(2) vars[col]=-1;
when(3) vars[col]=0;
when(4) vars[col]=1;
otherwise;
end;
end;
output;
end;
drop row col;
run;
So those two outputs would look something like this:
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.