Hello!
In my task, I have to check whether a string present in the dataset for one date (e.g. April 30, 2025) is also present in the dataset for the previous date (April 29, 2025). This is a dynamic task, so I think I need to use SAS macro code.
I created one dataset for each date in April (30 datasets in total). Then I have to check whether a string is matched in the dataset of the previous day, and so on for each day of April (maybe something like (df April 30, 2025) left join (df April 29, 2025) where the string is null in (df April 29, 2025)).
Do you have any ideas or advice on how to do this task?
Thanks
Here is something to get you started.
You could then use CALL EXECUTE to run the macro %check() for every date in April.
data data_1apr2025;
input data date9. tkt $;
format data date9. ;
datalines;
01APR2025 3333123
01APR2025 43333111
;
RUN;

data data_2apr2025;
input data date9. tkt $;
format data date9. ;
datalines;
02APR2025 99999999
02APR2025 43333111
02APR2025 11111111
;
RUN;

%macro check(date=);
proc sql;
create table want_&date. as
select * from data_&date.
where tkt not in
(select distinct tkt from data_%sysfunc(prxchange(s/^0//,1,%sysfunc(intnx(day,"&date."d,-1),date9.)) ));
quit;
%mend;

%check(date=2apr2025)
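For the CALL EXECUTE part, a minimal sketch could look like the step below. It assumes that %check has already been compiled, that all 30 datasets exist under the names data_1apr2025 through data_30apr2025, and that the range of interest is 2 April to 30 April 2025. The variable names d and name are only illustrative.

```sas
/* Sketch only: call %check once per day from 2 April to 30 April 2025. */
/* Assumes %check is compiled and data_1apr2025 ... data_30apr2025 exist. */
data _null_;
  do d = '02APR2025'd to '30APR2025'd;
    /* date9. gives e.g. 02APR2025; drop the leading zero to match the dataset names */
    name = prxchange('s/^0//', 1, put(d, date9.));
    /* %nrstr delays execution of the macro until this DATA step has finished */
    call execute(cats('%nrstr(%check(date=', name, '))'));
  end;
run;
```

Each iteration queues one call such as %check(date=2APR2025), which creates want_2APR2025. Dataset names are not case-sensitive in SAS, so the uppercase month still matches data_2apr2025.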
Not sure if macros are needed. If you have a number of data sets (do you mean SAS data sets?) and they are all in one library (folder) with some sort of common naming scheme, you should be able to combine them all into one large SAS data set and then just do a loop in a DATA step.
Please describe the location(s) of the data sets, the naming scheme, and what they contain in more detail.
I have a SAS dataset like this:
and so on for all the 30 days of April....
I need to check whether the dataset data_2apr2025 contains any TKT values that do NOT MATCH the dataset data_1apr2025.
This is what I did with these two datasets:
proc sql;
create table not_matching as select distinct a.tkt from data_2apr2025 a left join data_1apr2025 b on a.tkt=b.tkt where b.tkt is null;
quit;
The output is:
So this query works if you have two static datasets; in my task, I need to loop the query over each day of April and compare it with the previous day.
Any idea?
Thanks
Okay, it helps to see what you are working with. I think a macro is required here. You make things harder by using data set names that don't sort alphabetically; a data set named _20250401 for April 1 would at least sort properly, both across months and within months. That may not matter for producing the SAS code for this problem, but it might matter when you go on to use these data sets.
%macro do_this;
%do date=%sysfunc(mdy(4,2,2025)) %to %sysfunc(mdy(4,30,2025));
%let previous_day=%eval(&date-1);
%let date1=%sysfunc(putn(&date,date9.));
%let previous_day1=%sysfunc(putn(&previous_day,date9.));
/* Remove leading zero from dates */
%if %substr(&date1,1,1)=0 %then %let date1=%substr(&date1,2);
%if %substr(&previous_day1,1,1)=0 %then %let previous_day1=%substr(&previous_day1,2);
proc sql;
create table not_matching_&date1 as select distinct a.tkt
from data_&date1 a
left join data_&previous_day1 b on a.tkt=b.tkt where b.tkt is null;
quit;
%end;
%mend;
%do_this
If you want to compare the values of a variable (whether it is character or numeric) between two datasets a MERGE is a good method. Make sure the data is sorted by the variable.
data data_1apr2025;
input data :date. tkt $;
format data date9. ;
datalines;
01APR2025 3333123
01APR2025 43333111
;
data data_2apr2025;
input data :date. tkt $;
format data date9. ;
datalines;
02APR2025 11111111
02APR2025 43333111
02APR2025 99999999
;
Now you can merge and use the IN= dataset option to check whether the values exist in both datasets or not.
data want;
merge data_1apr2025(in=in1) data_2apr2025(in=in2);
by tkt;
if not (in1 and in2);
run;
Results:
OBS data tkt
1 02APR2025 11111111
2 01APR2025 3333123
3 02APR2025 99999999
If you don't want that second mismatch for some reason then just change the criteria.
if in2 and not in1;
But since you also have the DATE (named DATA for some reason) in the dataset, perhaps it would be easier to interleave the datasets instead? Then the check for a mismatch is just whether a TKT value has only one observation. So the IN= dataset option is not needed.
data want;
set data_1apr2025 data_2apr2025;
by tkt data;
if (first.tkt and last.tkt);
run;
Or perhaps you want to find the places where there is a gap in the appearance of TKT for one or more dates?
data data_3apr2025;
input data :date. tkt $;
format data date9. ;
datalines;
03APR2025 3333123
03APR2025 43333111
;
data want;
set data_: ;
by tkt data;
lag_data=lag(data);
format lag_data date9.;
if (not first.tkt) and (data-1 ne lag_data);
run;
Result
OBS data tkt lag_data
1 03APR2025 3333123 01APR2025
You could compare two daily datasets at a time, but that would mean processing most of the datasets twice, once as the "before" date, and once as the "after".
But if each of the datasets is sorted by TKT, then you could process all of the datasets in a single pass. Something like this (I have changed the daily dataset names to DATA_20250401, DATA_20250402, ... DATA_20250430):
data want;
set data_202504: ;
by tkt descending date;
if first.tkt=0 and dif(date)^=-1 then output;
else if first.tkt=1 and date^='30apr2025'd then output;
run;
If the data are not sorted by TKT and if sorting would be expensive, then read the datasets in reverse chronological order. You could use two hash objects to hold current and next daily data (NEXTDAY in the code below). If an incoming observation has a TKT not found in the NEXTDAY object, then output it. At the end of each day, clear the NEXTDAY object and copy the CURRDAY data into it, in preparation for new current date.
data want;
set data_202504: ;
by descending date;
if _n_=1 then do;
declare hash currday();
currday.definekey('tkt');
currday.definedata('tkt','date');
currday.definedone();
declare hiter i ('currday');
declare hash nextday();
nextday.definekey('tkt');
nextday.definedata('tkt','date');
nextday.definedone();
end;
if date='30apr2025'd then do;
nextday.add();
return;
end;
currday.add();
if nextday.check()^=0 then output;
if last.date then do;
/*Replace NEXTDAY with CURRDAY hash object */
nextday.clear();
do while (i.next()=0);
nextday.add();
end;
currday.clear();
end;
run;
Note these programs assume there are no duplicate TKT values within each daily dataset.