Ronein
Meteorite | Level 14

Hello

In my real situation I have 24 data sets (named ABC followed by YYYYMM) and 100 variables.

Here is code that calculates, for each source data set and numeric variable, the following statistics:

Number of rows

Number of rows with missing value

Number of rows with Positive value

Number of rows with negative value

Number of distinct values (including the null value)

The problem is that I need to run it for 24 months and for 100 variables.

My question-

Can you show an efficient way (fewer lines of code) to run this macro?

Here I run it with 12 macro calls, but I guess it can be done with just one.

Thank you

 


Data ABC202401;
Input ID X Y Z;
cards;
1 10 20 30
2 11 21 31
3 12 31 41
4 10 15 20
5 30 40 50
6 5 10 .
7 . . -100

;
Run;

Data ABC202402;
Input ID X Y Z;
cards;
1 15 30 17
2 18 21 43
3 31 50 41
3 8 18 28
;
Run;

Data ABC202403;
Input ID X Y Z;
cards;
1 43 23 62
2 21 16 .
3 47 50 .
4 -10 -20 -30
;
Run;


Data ABC202404;
Input ID X Y Z;
cards;
1 -70 14 62
2 -15 42 45
3 27 . .
;
Run;




%macro RRR_numeric_vars(month,VAR);
proc sql;
create table _summary_ as
select  "&Var." as var,"&month." as month,
        count(*) as nr_Rows,
        sum(case when &Var.>0 then 1 else 0 end ) as nr_Rows_POS,
        sum(case when &Var.=0 then 1 else 0 end ) as nr_Rows_ZERO,
        sum(case when &Var.<0 AND &Var. ne .  then 1 else 0 end ) as nr_Rows_NEG_no_missing,
        sum(case when &Var.=. then 1 else 0 end ) as nr_Rows_Missing
from ABC&month.
;
quit;
proc append data=_summary_ base=Summary_All force;quit;
%mend RRR_numeric_vars;
/*proc delete data=Summary_All;Run;*/
%RRR_numeric_vars(month=202401,VAR=X);
%RRR_numeric_vars(month=202402,VAR=X);
%RRR_numeric_vars(month=202403,VAR=X);
%RRR_numeric_vars(month=202404,VAR=X);

%RRR_numeric_vars(month=202401,VAR=Y);
%RRR_numeric_vars(month=202402,VAR=Y);
%RRR_numeric_vars(month=202403,VAR=Y);
%RRR_numeric_vars(month=202404,VAR=Y);

%RRR_numeric_vars(month=202401,VAR=Z);
%RRR_numeric_vars(month=202402,VAR=Z);
%RRR_numeric_vars(month=202403,VAR=Z);
%RRR_numeric_vars(month=202404,VAR=Z);

proc print data=Summary_All noobs;Run;

Quentin
Super User

How big is your data?  Would it be feasible to concatenate your 24 datasets into one dataset, create a variable holding the YYYYMM source, and then get the counts you want using PROC FREQ or one SQL step?  Something like:

 

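/* Stack all monthly data sets; INDSNAME= records which data set each row came from */
data abc ;
  set abc20:  indsname=dsname;
  source=dsname ;  /* e.g. WORK.ABC202401 */
run ;

/* Collapse numeric values into sign categories (note: zero is labelled Positive here) */
proc format ;
  value negpos
        low -< 0 = 'Negative'
        0 - high = 'Positive'
        ._ - .Z  = 'Missing'
  ;
run ;

/* One-way counts per source data set; NLEVELS adds the number of (formatted) levels per variable */
proc freq data=abc nlevels;
  tables x y z/missing list ;
  by source ;
  format x y z negpos. ;
run ;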
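
If zeros should be counted on their own, as in the original macro's nr_Rows_ZERO column, the format could be split three ways. This is only a minimal sketch, assuming the concatenated data set abc from the step above; the format name negzeropos is made up for illustration.

```sas
/* Hypothetical variant of the format: zero gets its own category */
proc format ;
  value negzeropos
        low -< 0  = 'Negative'   /* strictly below zero; LOW does not include missing */
        0         = 'Zero'
        0 <- high = 'Positive'   /* strictly above zero */
        ._ - .Z   = 'Missing'    /* ordinary and special missing values */
  ;
run ;

/* Same PROC FREQ step as before, with the three-way format applied */
proc freq data=abc nlevels ;
  by source ;
  tables x y z / missing ;
  format x y z negzeropos. ;
run ;
```

The BY statement relies on the rows of abc arriving grouped by source, which should hold when the monthly data sets are simply stacked one after another.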
PaigeMiller
Diamond | Level 26

No macros needed! I like it!

 

No SQL needed, I like it even more!

--
Paige Miller
ballardw
Super User

Suggest use of: TABLES _NUMERIC_ / missing list;

 

This assumes the "100" variables are all the numeric variables in the data set. Then you don't even need to know the names.
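
A rough sketch of how that could look, assuming the concatenated data set abc and the negpos format from Quentin's reply already exist. The ODS OUTPUT statement and the output data set names freq_all and nlev_all are my own additions, in case the counts are wanted as data sets rather than printed output.

```sas
/* Sketch only: relies on WORK.ABC and the NEGPOS format created earlier in the thread */
proc freq data=abc nlevels ;
  by source ;                          /* one set of tables per monthly data set */
  tables _numeric_ / missing ;         /* every numeric variable, no names needed */
  format _numeric_ negpos. ;           /* group values into sign categories */
  /* freq_all and nlev_all are hypothetical names for the captured tables */
  ods output OneWayFreqs=work.freq_all NLevels=work.nlev_all ;
run ;
```

Note that _NUMERIC_ also picks up ID when it is numeric, so it may be worth dropping it with a DROP= data set option on abc.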
