BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello,

In want to calculate statistics for all numeric variables.

There are some statistics that I don't know how to calculate for each numeric variable -

Number of rows with zero value (nr_Zero)

Number of rows with missing value (nr_Null)

Number of rows with positive value (nr_POS)

Number of rows with negative value that is not null (nr_Zero)

 

What is the way to calculate it automatically for each numeric var?


proc means data=sashelp.class sum min max q1 median q3  P1 p5 p95 P99 P75 P25 cv SUM Range N mean std  STACKODSOUTPUT maxdec=2;
var _numeric_;
ods output summary=Want1(Rename=(Variable=Var_Name));
run;


ods output nlevels=LEVELS;
proc freq data=sashelp.class nlevels;
tables _numeric_ / noprint ;
run;

Nr_Null
Nr_Zero
Nr_POS
Nr_Neg
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You could use the SIGN() function to convert your values into the four categories.  

Or you could define a format.

proc format;
 value sign low-<0 = 'Negative'
       0 = 'Zero'
       0<-high = 'Positive'
       other = 'Missing'
 ;
run;

Once you have the format you could use it with PROC FREQ for example:

proc freq data=have;
  tables _numeric_ / missing list ;
  format _numeric_ sign.;
run;

Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.

proc summary data=have chartype completetypes;
 ways 1;
 class _numeric_ / missing preloadfmt;
 format _numeric_ sign.;
 output out=counts;
run;

data summary ;
  length name $32 ;
  set counts ;
  array vars _numeric_;
  index=indexc(_type_,'1');
  name=vname(vars[index]);
  sign=vvalue(vars[index]);
  count=_freq_;
  keep name sign count;
run;

proc transpose data=summary out=want(drop=_name_) ;
  by name notsorted;
  id sign;
  var count;
run;

Let's make some sample data:

data have;
  array vars var1-var4 ;
  do row=1 to 10;
    do col=1 to dim(vars);
      select(rand('integer',4));
        when(1) vars[col]=.;
        when(2) vars[col]=-1;
        when(3) vars[col]=0;
        when(4) vars[col]=1;
        otherwise;
      end;
    end;
    output;
  end;
  drop row col;
run;

So those two outputs would look something like this:

Tom_0-1716737631545.png

Tom_1-1716737650133.png

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Seems like this is answered in your other thread.

--
Paige Miller
sbxkoenk
SAS Super FREQ

@PaigeMiller wrote:

Seems like this is answered in your other thread.


Indeed ... and that other thread is

Solved: Wide to Long - SAS Support Communities

 

Still I have re-written the program without first transposing from wide to long.
The program is to be considered un-tested as SASHELP.CLASS is a bad choice for developing and testing the approach (no missing values, no zeros, ...).

Other remark : program goes through the full data (again) with every macro call. If your data is big data, you might want to re-write such that program goes through the data only once.

 

PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
 delete want_summary_Report_: ; run;
 delete distinct_levels_:     ; run;
 delete profil                ; run;
run; QUIT;

%MACRO profil(checknm=nulls , checkop=EQ .);
data work.want_summary_Report_&checknm.(keep=nr_&checknm._age nr_&checknm._height nr_&checknm._weight);
 set sashelp.class end=last;
*array in{*}  _NUMERIC_;
 array in{3}  age height weight;
 array out{3} _TEMPORARY_ ;
/* Temporary data element values are always automatically retained, */ 
/* rather than being reset to missing at the beginning              */ 
/* of the next iteration of the DATA step.                          */
 do i=1 to dim(in);
  if _N_ = 1 then do; out(i)=0; end;
  if      i=1 and in(i) &checkop. then out(i) = out(i) + 1;
  else if i=2 and in(i) &checkop. then out(i) = out(i) + 1;
  else if i=3 and in(i) &checkop. then out(i) = out(i) + 1;
  else;
 end;
 if last then do; 
  nr_&checknm._age    = out(1); 
  nr_&checknm._height = out(2); 
  nr_&checknm._weight = out(3); 
  output;
 end;
run;
PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
 modify want_summary_Report_&checknm.;
  rename nr_&checknm._age    = age;
  rename nr_&checknm._height = height;
  rename nr_&checknm._weight = weight;
run; QUIT;
%MEND  profil;

%profil(checknm=nulls , checkop=EQ .);
%profil(checknm=zeros , checkop=EQ 0);
%profil(checknm=negw0 , checkop=LT 0);
%profil(checknm=posw0 , checkop=GT 0);

proc transpose data=want_summary_Report_nulls 
                out=want_summary_Report_nulls_tp(rename=(COL1=nr_nulls)); run;
proc transpose data=want_summary_Report_zeros 
                out=want_summary_Report_zeros_tp(rename=(COL1=nr_zeros)); run;
proc transpose data=want_summary_Report_negw0 
                out=want_summary_Report_negw0_tp(rename=(COL1=nr_negw0)); run;
proc transpose data=want_summary_Report_posw0 
                out=want_summary_Report_posw0_tp(rename=(COL1=nr_posw0)); run;

data profil;
 merge work.want_summary_Report_nulls_tp
       work.want_summary_Report_zeros_tp
	   work.want_summary_Report_negw0_tp
	   work.want_summary_Report_posw0_tp;
 by _NAME_;
 label _NAME_="Variable profiled";
run;

proc freq data=sashelp.class nlevels noprint;
 tables age    / out=work.distinct_levels_age;
 tables height / out=work.distinct_levels_height;
 tables weight / out=work.distinct_levels_weight;
run;
/* end of program */

Koen

Tom
Super User Tom
Super User

You could use the SIGN() function to convert your values into the four categories.  

Or you could define a format.

proc format;
 value sign low-<0 = 'Negative'
       0 = 'Zero'
       0<-high = 'Positive'
       other = 'Missing'
 ;
run;

Once you have the format you could use it with PROC FREQ for example:

proc freq data=have;
  tables _numeric_ / missing list ;
  format _numeric_ sign.;
run;

Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.

proc summary data=have chartype completetypes;
 ways 1;
 class _numeric_ / missing preloadfmt;
 format _numeric_ sign.;
 output out=counts;
run;

data summary ;
  length name $32 ;
  set counts ;
  array vars _numeric_;
  index=indexc(_type_,'1');
  name=vname(vars[index]);
  sign=vvalue(vars[index]);
  count=_freq_;
  keep name sign count;
run;

proc transpose data=summary out=want(drop=_name_) ;
  by name notsorted;
  id sign;
  var count;
run;

Let's make some sample data:

data have;
  array vars var1-var4 ;
  do row=1 to 10;
    do col=1 to dim(vars);
      select(rand('integer',4));
        when(1) vars[col]=.;
        when(2) vars[col]=-1;
        when(3) vars[col]=0;
        when(4) vars[col]=1;
        otherwise;
      end;
    end;
    output;
  end;
  drop row col;
run;

So those two outputs would look something like this:

Tom_0-1716737631545.png

Tom_1-1716737650133.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 621 views
  • 5 likes
  • 4 in conversation