I want to build a macro that does the following data steps, but I am not sure how to use the consecutive variable list.
For example:
data compare7;
merge fileid5 fileid 6;
by vehid;
run;
proc sql;
select count(*)
from (select distinct vehicleid
from fileid7
except
select distinct vehicleid from compare7 );
quit;
data compare8;
merge compare7 fileid7;
by vehid;
run;
proc sql;
select count(*)
from (select distinct vehicleid
from fileid8
except
select distinct vehicleid from compare8 );
quit;
data compare9;
merge compare8 fileid8;
by vehid;
run;
proc sql;
select count(*)
from (select distinct vehicleid
from fileid9
except
select distinct vehicleid from compare9 );
quit;
data compare10;
merge compare9 fileid9;
by vehid;
run;
proc sql;
select count(*)
from (select distinct vehicleid
from fileid10
except
select distinct vehicleid from compare10 );
quit;
At the end of each data step, I do a PROC SQL to compare the distinct variables from the merged files and a new file (fileid8).
Please help me with developing a macro for this as I have to compare more than 100 files and it is impractical to do it manually at every step.
How do I add the consecutive variables at every step in a macro?
What is the actual problem trying to be solved here?
You seem to be trying to count number of cars added per file?
data want ;
merge fileid5 (in=in5 keep=vehicleid)
fileid6 (in=in6 keep=vehicleid)
fileid7 (in=in7 keep=vehicleid)
fileid8 (in=in8 keep=vehicleid)
fileid9 (in=in9 keep=vehicleid)
fileid10(in=in10 keep=vehicleid)
end=eof
;
by vechicleid;
if first.vehicleid;
new6 + (in6 and ^in5);
new7 + (in7 and ^in6);
new8 + (in8 and ^in7);
new9 + (in9 and ^in8);
new10+ (in10 and ^in9);
if eof;
put (new6-new10) (=/);
run;
It looks like all the numbers depend on a single number that could be a macro parameter. For example, your code would expect FIRST_FILE=5:
%macro files (first_file=);
data compare%eval(&first_file+2);
merge fileid&first_file fileid%eval(&first_file+1);
by vehid;
run;
......
%mend files;
%files (first_file=5)
That is helpful, but I am having a problem with adding the SQL code to the macro
proc sql;
select count(*)
from (select distinct vehicleid
from &first_file+2
except
select distinct vehicleid from compare%eval(&first_file+2));
quit;
Is the bolded part right? I get this error
SAS Log:
There were 683 observations read from the data set WORK.FILEID6.
NOTE: There were 621 observations read from the data set WORK.FILEID7.
NOTE: The data set WORK.COMPARE8 has 843 observations and 15 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
NOTE: Line generated by the macro variable "FIRST_FILE".
1 6
-
22
76
ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, (, CONNECTION, DICTIONARY.
ERROR 76-322: Syntax error, statement will be ignored.
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
i believe the problem is at the underlined part, where sas is expecting a name of the dataset rather than a number, here &first_file will resolve to a number but not the name of the dataset. something like field%eval(&first_file+2)) should work well.
proc sql;
select count(*)
from (select distinct vehicleid
from &first_file+2
except
select distinct vehicleid from compare%eval(&first_file+2));
quit;
Thanks,
jagadish
Looks like I was wrong. What I wanted in the merge statement is to merge all the previous datasets. How should that be done?
For instance:
%macro files (first_file=);
data compare%eval(&first_file+2);
merge fileid&first_file fileid%eval(&first_file+1); ------- Here instead of first_file, I want everything before 8,.. That is merge all previous fileid 7,6,5,4,3,2,1
by vehid;
run;
......
%mend files;
%files (first_file=8)
Any ideas? I want all of my previous datasets in my merge statement and the new file to be compared to all the older datasets.
Inputs please
Here's how to handle that piece. All of this replaces the MERGE statement;
%local i;
merge
%do i = &first_file - 1 %to 1 %by -1;
fileid&i
%end;
;
The semicolon at the end completes the MERGE statement, and the %DO loop builds the list of data set names to be merged.
Does that include all the previous datasets until the current one I am comparing with. For some reason, I don't get the numbers I am looking for. For instance, for fileid=9, there were only 20 new vehicles added, but I get 714.
That includes all the lower numbers:
fileid8 fileid7 fileid6 fileid5 fileid4 fileid3 fileid2 fileid1
Try adding this line first, so you can see the code that gets generated:
options mprint;
Let me try again! And I already have the mprint option.
The bolded part does look correct.
The first thing to check is whether the program itself contains extra blanks. I assume that you actually have the word "fileid" in the program:
from fileid%eval(&first_file+2)
Make sure that there is no space between fileid and %eval. (Same would apply everywhere %eval appears.) If that's already in good shape, we can still find a solution but it would be marginally more complex.
What is the actual problem trying to be solved here?
You seem to be trying to count number of cars added per file?
data want ;
merge fileid5 (in=in5 keep=vehicleid)
fileid6 (in=in6 keep=vehicleid)
fileid7 (in=in7 keep=vehicleid)
fileid8 (in=in8 keep=vehicleid)
fileid9 (in=in9 keep=vehicleid)
fileid10(in=in10 keep=vehicleid)
end=eof
;
by vechicleid;
if first.vehicleid;
new6 + (in6 and ^in5);
new7 + (in7 and ^in6);
new8 + (in8 and ^in7);
new9 + (in9 and ^in8);
new10+ (in10 and ^in9);
if eof;
put (new6-new10) (=/);
run;
Tom
That is the option I would go for. Unfortunately, I have about 100 files now and more files will be coming in .
I am counting the number of unique cars added per file. There should be no overlap in the vehicle numbers in each fileid.
So, I modified the code to get the number of unique new cars added at every step and it works. It is a longer version to get the value for 100+ files, but I can live with it for now. Thanks Tom!
data want ;
merge fileid5 (in=in5 keep=vehicleid)
fileid6 (in=in6 keep=vehicleid)
fileid7 (in=in7 keep=vehicleid)
fileid8 (in=in8 keep=vehicleid)
fileid9 (in=in9 keep=vehicleid)
fileid10(in=in10 keep=vehicleid)
end=eof
;
by vehicleid;
if first.vehicleid;
new6 + (in6 and ^in5);
new7 + (in7 and ^in6 and ^in5);
new8 + (in8 and ^in7 and ^in6 and ^in5);
new9 + (in9 and ^in8 and ^in7 and ^in6 and ^in5);
new10+ (in10 and ^in9 and ^in8 and ^in7 and ^in6 and ^in5);
if eof;
put (new6-new10) (=/);
run;
Thank you very much for your time everyone.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.