Help using Base SAS procedures

DATA PROFILING in macro

Reply
Contributor
Posts: 38

DATA PROFILING in macro

[ Edited ]

Hi all,

I have a table(attached) where it has 

SYSTEM, FILENAME, TABLE_NAME, ATTRIBUTES, JOIN_TABLE,  SECOND ,  AMOUNT,  PROGRAM, LOOKUP and I have a some sub macros programming: 

%A(SYSTEMS=&SYS,TABLENAME=&OUTTABLE,PARENTTABLE=&FILE,FIRST=&FIELDS,BALANCE=&CADTOTAL) ;

%B(SYSTEMS=&SYS,TABLENAME=&OUTTABLE,) ;

%C,%D ETC........

Expected Schema for code:

%macro LOOP;
%do i=1 %to &runupto(contains number of lines in schema table) ;
proc sql noprint;
select SYSTEM, FILENAME, TABLE_NAME, ATTRIBUTES, JOIN_TABLE, SECOND , AMOUNT, PROGRAM, LOOKUP
into : SYS, : FILE , Smiley SurprisedUTTABLE, :FIELDS, : JOINTABLE, :JOINFIELD, :CADTOTAL, :CHECKER, :LOOK
from DATAVALIDATION where number = &i AND &SYSTEM='OPICS' AND &FILENAME='SECFI';
%IF &CHECKER='A'(this line should help me when ever the CHECKER is A or B or C it has to perform its own respective macros respectively)
%THEN %DO;

%CHECK(SYSTEMS=&SYS,TABLENAME=&OUTTABLE,PARENTTABLE=&FILE,FIRST=&FIELDS,BALANCE=&CADTOTAL);

%end;

%IF &CHECKER='B'
%THEN %DO;
%CHECK(SYSTEMS=&SYS,TABLENAME=&OUTTABLE,PARENTTABLE=&FILE,FIRST=&FIELDS,BALANCE=&CADTOTAL);
%END;

%IF &CHECKER='c'
%THEN %DO;

%C(SYSTEMS=&SYS,TABLENAME=&OUTTABLE,PARENTTABLE=&FILE,FIRST=&FIELDS,BALANCE=&CADTOTAL);
%END;
%END;
%MEND LOOP;

EXPLAINATION:

I want to use SCHEMA dataset and trying to check missing values  using some   macro programs to produce output according to system , where all the ATTRIBUTES of individual SYSTEM will be on one file. In my macros there is an TABLENAME which produces OUTPUT TABLE for each attribute, but instead of that i need a OUTPUT TABLE according to system by appending all the tables of individual attributes created and it should print output table for all individual attribute (i.e it should run in back end)

 

can any one tell me the code.

schema for ouput table is in OUTPUT.png file

please let me know if there are any misunderstandings.

your help is much appreciated




ouput.png
Super User
Posts: 5,260

Re: DATA PROFILING in macro

You are putting forward a whole system.
US it possible to narrow down your question to a specific problem?
Try to eliminate steps in your code that is irrelevant for your inquiry.
Data never sleeps
Contributor
Posts: 38

Re: DATA PROFILING in macro

[ Edited ]

I have some data sets which comes every month and I have to check data profiling. where I should get missing total, non missing total , missing % total,non missing% total and including this for some of the attributes I have to check resective values are valid are not by joining some other look-up tables to it.
The program for above description:


sub macro programs:
1)check
2)macro_check
3)....

4)...etc

 

DATAVALIDATION: schema table which contains all the attributes of all datasets.(please see attached file)

 

PROGRAMMING:
%macro LOOP;
%do i=1 %to &runupto ;(STEP 1)
proc sql noprint;
select SYSTEM, FILENAME, TABLE_NAME, ATTRIBUTES, JOIN_TABLE, SECOND , AMOUNT, PROGRAM, LOOKUP
into Smiley FrustratedYS, :FILE , Smiley SurprisedUTTABLE, :FIELDS, :JOINTABLE, :JOINFIELD, :CADTOTAL, :CHECKER, :LOOK
from DATAVALIDATION where NUMBER=&i AND PROGRAM='CHECK';
QUIT;
%CHECK(SYSTEMS=&SYS,PARENTTABLE=&FILE,FIRST=&FIELDS,BALANCE=&CADTOTAL);
proc append base=SOURCE data=TABLE1;
run;
%END;
%do i=1 %to &runupto ;(same STEP 1 repeated)
proc sql noprint;
select SYSTEM, FILENAME, TABLE_NAME, ATTRIBUTES, JOIN_TABLE, SECOND , AMOUNT, PROGRAM, LOOKUP
into Smiley FrustratedYS, :FILE , Smiley SurprisedUTTABLE, :FIELDS, :JOINTABLE, :JOINFIELD, :CADTOTAL, :CHECKER, :LOOK
from DATAVALIDATION where NUMBER=&i AND PROGRAM='PROGRAM_CHECK';
QUIT;
%COMPARE_CHECK(SYSTEMS=&SYS,FIRST=&FIELDS,SECOND=&JOINFIELD,BALANCE=&CADTOTAL,PARENTTABLE=&FILE,JOINTABLE=&JOINTABLE);
run;
proc append base=SOURCE data=TABLE2;
QUIT;
%END;
%MEND LOOP;
%LOOP;

my question is in the above programming every time when ever I create a new macro sub programming(i.e like check,macro_check..etc) I have to repeat --STEP 1-- again and also because of repeating this STEP1 it will start to check the data set from starting row again and again which takes lot of time.there is any way I can write MACRO for it so that looping should be done and run all sub_macros at same time and should produce output and append result to SOURCE table.


I apologize for writing big description again. I am not getting how to cut short it.

please let me know for any misunderstandings.

Thanks in advance


Untitled.png
Super User
Super User
Posts: 6,502

Re: DATA PROFILING in macro

[ Edited ]

The basic tool for aggregating data from multiple steps is PROC APPEND.

 

Forget the macro aspect of it.

Figure out the logic of the actual CODE you want to run.

Then you can figure out how to transform it into a macro.

 

So what steps do you want to run.  To make it work faster read through the data as few times as possible.It looks like your problem is that the "data validation" steps are independent of the work flow logic of this program. Did I understand that right?

In that case you probably have a very limited options for how you can iminate multiple passes through the data.  You possibly could have a class of validation code that could be run all together in the same data step or SQL select statement.  So you might be able to re-engineer your metadata to say that and build just one step for all of those checks.  But you will probably still have other steps that need to run independently since they will have multiple step boundaries.

 

 

 

Ask a Question
Discussion stats
  • 3 replies
  • 564 views
  • 0 likes
  • 3 in conversation