BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Ronein
Meteorite | Level 14

Hello,

In want to calculate statistics for all numeric variables.

There are some statistics that I don't know how to calculate for each numeric variable -

Number of rows with zero value (nr_Zero)

Number of rows with missing value (nr_Null)

Number of rows with positive value (nr_POS)

Number of rows with negative value that is not null (nr_Zero)

 

What is the way to calculate it automatically for each numeric var?


proc means data=sashelp.class sum min max q1 median q3  P1 p5 p95 P99 P75 P25 cv SUM Range N mean std  STACKODSOUTPUT maxdec=2;
var _numeric_;
ods output summary=Want1(Rename=(Variable=Var_Name));
run;


ods output nlevels=LEVELS;
proc freq data=sashelp.class nlevels;
tables _numeric_ / noprint ;
run;

Nr_Null
Nr_Zero
Nr_POS
Nr_Neg
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You could use the SIGN() function to convert your values into the four categories.  

Or you could define a format.

proc format;
 value sign low-<0 = 'Negative'
       0 = 'Zero'
       0<-high = 'Positive'
       other = 'Missing'
 ;
run;

Once you have the format you could use it with PROC FREQ for example:

proc freq data=have;
  tables _numeric_ / missing list ;
  format _numeric_ sign.;
run;

Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.

proc summary data=have chartype completetypes;
 ways 1;
 class _numeric_ / missing preloadfmt;
 format _numeric_ sign.;
 output out=counts;
run;

data summary ;
  length name $32 ;
  set counts ;
  array vars _numeric_;
  index=indexc(_type_,'1');
  name=vname(vars[index]);
  sign=vvalue(vars[index]);
  count=_freq_;
  keep name sign count;
run;

proc transpose data=summary out=want(drop=_name_) ;
  by name notsorted;
  id sign;
  var count;
run;

Let's make some sample data:

data have;
  array vars var1-var4 ;
  do row=1 to 10;
    do col=1 to dim(vars);
      select(rand('integer',4));
        when(1) vars[col]=.;
        when(2) vars[col]=-1;
        when(3) vars[col]=0;
        when(4) vars[col]=1;
        otherwise;
      end;
    end;
    output;
  end;
  drop row col;
run;

So those two outputs would look something like this:

Tom_0-1716737631545.png

Tom_1-1716737650133.png

 

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

Seems like this is answered in your other thread.

--
Paige Miller
sbxkoenk
SAS Super FREQ

@PaigeMiller wrote:

Seems like this is answered in your other thread.


Indeed ... and that other thread is

Solved: Wide to Long - SAS Support Communities

 

Still I have re-written the program without first transposing from wide to long.
The program is to be considered un-tested as SASHELP.CLASS is a bad choice for developing and testing the approach (no missing values, no zeros, ...).

Other remark : program goes through the full data (again) with every macro call. If your data is big data, you might want to re-write such that program goes through the data only once.

 

PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
 delete want_summary_Report_: ; run;
 delete distinct_levels_:     ; run;
 delete profil                ; run;
run; QUIT;

%MACRO profil(checknm=nulls , checkop=EQ .);
data work.want_summary_Report_&checknm.(keep=nr_&checknm._age nr_&checknm._height nr_&checknm._weight);
 set sashelp.class end=last;
*array in{*}  _NUMERIC_;
 array in{3}  age height weight;
 array out{3} _TEMPORARY_ ;
/* Temporary data element values are always automatically retained, */ 
/* rather than being reset to missing at the beginning              */ 
/* of the next iteration of the DATA step.                          */
 do i=1 to dim(in);
  if _N_ = 1 then do; out(i)=0; end;
  if      i=1 and in(i) &checkop. then out(i) = out(i) + 1;
  else if i=2 and in(i) &checkop. then out(i) = out(i) + 1;
  else if i=3 and in(i) &checkop. then out(i) = out(i) + 1;
  else;
 end;
 if last then do; 
  nr_&checknm._age    = out(1); 
  nr_&checknm._height = out(2); 
  nr_&checknm._weight = out(3); 
  output;
 end;
run;
PROC DATASETS library=WORK NoList NoDetails memtype=DATA;
 modify want_summary_Report_&checknm.;
  rename nr_&checknm._age    = age;
  rename nr_&checknm._height = height;
  rename nr_&checknm._weight = weight;
run; QUIT;
%MEND  profil;

%profil(checknm=nulls , checkop=EQ .);
%profil(checknm=zeros , checkop=EQ 0);
%profil(checknm=negw0 , checkop=LT 0);
%profil(checknm=posw0 , checkop=GT 0);

proc transpose data=want_summary_Report_nulls 
                out=want_summary_Report_nulls_tp(rename=(COL1=nr_nulls)); run;
proc transpose data=want_summary_Report_zeros 
                out=want_summary_Report_zeros_tp(rename=(COL1=nr_zeros)); run;
proc transpose data=want_summary_Report_negw0 
                out=want_summary_Report_negw0_tp(rename=(COL1=nr_negw0)); run;
proc transpose data=want_summary_Report_posw0 
                out=want_summary_Report_posw0_tp(rename=(COL1=nr_posw0)); run;

data profil;
 merge work.want_summary_Report_nulls_tp
       work.want_summary_Report_zeros_tp
	   work.want_summary_Report_negw0_tp
	   work.want_summary_Report_posw0_tp;
 by _NAME_;
 label _NAME_="Variable profiled";
run;

proc freq data=sashelp.class nlevels noprint;
 tables age    / out=work.distinct_levels_age;
 tables height / out=work.distinct_levels_height;
 tables weight / out=work.distinct_levels_weight;
run;
/* end of program */

Koen

Tom
Super User Tom
Super User

You could use the SIGN() function to convert your values into the four categories.  

Or you could define a format.

proc format;
 value sign low-<0 = 'Negative'
       0 = 'Zero'
       0<-high = 'Positive'
       other = 'Missing'
 ;
run;

Once you have the format you could use it with PROC FREQ for example:

proc freq data=have;
  tables _numeric_ / missing list ;
  format _numeric_ sign.;
run;

Or you could work a little harder and make a prettier output by using PROC SUMMARY and PROC TRANSPOSE.

proc summary data=have chartype completetypes;
 ways 1;
 class _numeric_ / missing preloadfmt;
 format _numeric_ sign.;
 output out=counts;
run;

data summary ;
  length name $32 ;
  set counts ;
  array vars _numeric_;
  index=indexc(_type_,'1');
  name=vname(vars[index]);
  sign=vvalue(vars[index]);
  count=_freq_;
  keep name sign count;
run;

proc transpose data=summary out=want(drop=_name_) ;
  by name notsorted;
  id sign;
  var count;
run;

Let's make some sample data:

data have;
  array vars var1-var4 ;
  do row=1 to 10;
    do col=1 to dim(vars);
      select(rand('integer',4));
        when(1) vars[col]=.;
        when(2) vars[col]=-1;
        when(3) vars[col]=0;
        when(4) vars[col]=1;
        otherwise;
      end;
    end;
    output;
  end;
  drop row col;
run;

So those two outputs would look something like this:

Tom_0-1716737631545.png

Tom_1-1716737650133.png

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 298 views
  • 5 likes
  • 4 in conversation