DATA Step, Macro, Functions and more

Macro and Proc SQL

Reply
New Contributor
Posts: 4

Macro and Proc SQL

Hello,

For the scenario below, i am trying to use a combination of macro and sql statement to run 1 year or more of data (Daily feeds)

I have daily tables and trying to extracts the dates changes were made to specific field in a table.

Day1:5-1-2017

AccountNumber Date     Field1

123456               5-1-2017            A

123425               5-1-2017            A

123426               5-1-2017            A

123427               5-1-2017            A

Day2:5-2-2017

123456              5-2-2017            A

123425               5-2-2017            A

123426               5-2-2017            B

123427               5-2-2017            A

Day3:5-3-2017

123456              5-3-2017            C

123425               5-3-2017            A

123426               5-3-2017            B

123427               5-3-2017            A

Day4:5-4-2017

123456              5-3-2017            A

123425               5-3-2017            A

123426               5-3-2017            B

123427               5-3-2017            A

 

Result:

5-2-2017           123426               5-2-2017            B

5-3-2017            123456              5-3-2017            C

5-4-2017            123456              5-3-2017            A

 

Below is what i have so far.

 

%macro date_loop(start,end);

%let start=%sysfunc(inputn(&start,anydtdte9.));

%let end=%sysfunc(inputn(&end,anydtdte9.));

%let dif=%sysfunc(intck(day,&start,&end));

%do i=0 %to &dif;

%let date=%sysfunc(intnx(day,&start,&i,b),date9.);

%put &date;

%end;

%mend date_loop;

%date_loop(01may2017,05may2017);

proc sql;

create table xyz as

(select * from table1

where FILE_SEND_DT = &start + 1

except

select * from table1

where FILE_SEND_DT = &start);

 quit;

 

Thanks in advance for your help.

 

Trusted Advisor
Posts: 1,389

Re: Macro and Proc SQL

If each of your daily tables are sorted by id, then it is a very simple data step.  Assuming all your daily files are named DAILYxxx (where xxx is used to specify the date), then:

 

data want;
  set daily: ;
  by id date;
  if lag(field1)^=field1 and first.id=0;
run;

 

 

But your sample does not show the daily tables as sorted.  You could sort each of them and run the above program.  But sorting a year's worth of daily files can be expensive .  In that case, a hash object which carries the most recent value of FIELD1 to compare against the current version can be used:

 

data want (drop=rc);
  set daily: ;
  by date;
  if _n_=1 then do;
    if 0 then set daily1 (rename=(field1=oldfield1));
    declare hash h (dataset:'daily1 (rename=(field1=oldfield1))');
      h.definekey('id');
      h.definedata('oldfield1');
      h.definedone();
  end;
  call missing(oldfield1);
  rc=h.find();
  if rc^=0 then h.replace(key:id,data:field1);
  if rc=0 and field1^=oldfield1;
  rc=h.replace(key:id,data:field1);
run;

 

 The dataset DAILY1 is the name  I use for the earliest daily  data set.

New Contributor
Posts: 4

Re: Macro and Proc SQL

Hello,

 

Thank you for the response. Each of the daily files is 1.5-1.9 million rows, therefore, i was using PROC SQL along with a macro rather than a data step (that has to get all that data into SAS).

 

For the SAS code below, if i were to run a month worth of data, are you suggesting extracting the files, naming them daily1, daily2, daily3....daily30? and then running the below code?

 

data want (drop=rc);
  set daily: ;
  by date;
  if _n_=1 then do;
    if 0 then set daily1 (rename=(field1=oldfield1));
    declare hash h (dataset:'daily1 (rename=(field1=oldfield1))');
      h.definekey('id');
      h.definedata('oldfield1');
      h.definedone();
  end;
  call missing(oldfield1);
  rc=h.find();
  if rc^=0 then h.replace(key:id,data:field1);
  if rc=0 and field1^=oldfield1;
  rc=h.replace(key:id,data:field1);
run;

Thanks!

Trusted Advisor
Posts: 1,389

Re: Macro and Proc SQL

What do you mean "extract the files"?  I had the impression that you already have a collection of daily data sets, which means no extraction is necessary.  If so what are their names?  Use those names. 

 

But I had also assumed presumed you have a naming convention in place, like   data_20170103 data_20170104 .... data_20171231, which would permit a compact way of generating a dataset name list, as opposed to entering every dataset name.

 

For example, using the convention in the above paragraph you could request all the 2017 July datasets using the expression data_201707:.

 

What naming convention, if any do your daily datasets have?

New Contributor
Posts: 4

Re: Macro and Proc SQL

I apologize, i need to clarify the problem. The data is in a table  in the data warehouse (it has over a million rows per load date). In order to prevent extracting over 45 million rows of data per month, i wanted to use the approach below.  

 

I used a macro to create dates (within the dates mentioned), then go through 1 iteration of PROC SQL statement per date to identify field changes for entire year. I hope that clarified your question.

 

%macro date_loop(start,end);

%let start=%sysfunc(inputn(&start,anydtdte9.));

%let end=%sysfunc(inputn(&end,anydtdte9.));

%let dif=%sysfunc(intck(day,&start,&end));

%do i=0 %to &dif;

%let date=%sysfunc(intnx(day,&start,&i,b),date9.);

%put &date;

%end;

%mend date_loop;

%date_loop(01may2017,05may2017);

proc sql;

create table xyz as

(select * from table1

where FILE_SEND_DT = &start + 1

except

select * from table1

where FILE_SEND_DT = &start);

 quit;

Ask a Question
Discussion stats
  • 4 replies
  • 180 views
  • 0 likes
  • 2 in conversation