If I merge two data sets using a SAS DATA step, SAS allows me to give the merged data set the same name as one of the original data sets. I have never had a problem doing this.
Example:
data a;
merge a b;
by id;
run;
Note, a and b are merged to form a new dataset a.
If I try doing this in PROC SQL, for example:
proc sql;
create table a as select a.*,b.height from a left join b on a.id=b.id;
quit;
I get a warning in the log, and I don't really understand the warning, or what risks I am taking by doing things this way. The warning in the LOG says:
WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.
So, what are the risks of doing this via PROC SQL? Or should I just adopt a policy of not doing this in PROC SQL?
Paige,
You get the message because PROC SQL may try to do the merge on the back-end server. If PROC SQL is running against SQL Server or Oracle, for example, it will try to generate SQL commands native to that server and pass them over for more efficient processing. You then run the risk of the server corrupting the data.
If all the data are SAS data sets, the message is not very useful: both the DATA step and PROC SQL build the merged result in a new temporary data set, so you are only at risk of losing data during the instant when the files are renamed.
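If you would rather not see the warning at all, one option is to write the join to a different table name and swap the names afterwards. The following is only a sketch under stated assumptions: the tables live in the WORK library, and a_merged is a hypothetical name chosen for illustration, not something from the original question.

```sas
/* Write the join result to a new table so the CREATE TABLE  */
/* statement does not reference its own target.              */
/* a_merged is a hypothetical name used for illustration.    */
proc sql;
  create table a_merged as
    select a.*, b.height
    from a left join b
      on a.id = b.id;
quit;

/* Replace the original table with the merged one.           */
/* Assumes both tables are in the WORK library.              */
proc datasets library=work nolist;
  delete a;
  change a_merged=a;
quit;
```

The original a stays intact until the join has finished, so a failed join leaves the source data untouched.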
Doc Muhlbaier
Duke
Okay, thanks Doc, that makes sense and it eliminates one of my worries.
Paige,
As Doc@Duke mentioned, the message is not very useful here, because both the DATA step and PROC SQL create a new temporary result data set with the merged data; the only window for losing data is the instant when the files are renamed.
One other thing to note, somewhat related, is that a PROC SQL join and a DATA step merge produce different results when the two data sets/tables have a many-to-many relationship (i.e. many records from one data set/table can match many records from the other). That's because PROC SQL produces a Cartesian product within each key value: every matching record from one table is paired with every matching record from the other. A DATA step merge instead matches records sequentially within each BY group, without looking back (once it matches, it moves on to the next record).
Hopefully that makes sense.
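A small example may make the many-to-many difference concrete. This is an illustrative sketch only; the data sets one and two and their values are made up for the demonstration.

```sas
/* Two made-up data sets that share id=1 on multiple rows. */
data one;
  input id x $;
  datalines;
1 a1
1 a2
;
run;

data two;
  input id y $;
  datalines;
1 b1
1 b2
1 b3
;
run;

/* DATA step merge: rows are paired in sequence within the BY group. */
/* Result has 3 rows: (a1,b1), (a2,b2), (a2,b3).                     */
/* The last value of x is carried forward once data set one runs out.*/
data merged_ds;
  merge one two;
  by id;
run;

/* PROC SQL join: every row of one is paired with every row of two   */
/* that has the same id, giving 2 x 3 = 6 rows.                      */
proc sql;
  create table merged_sql as
    select o.id, o.x, t.y
    from one as o inner join two as t
      on o.id = t.id;
quit;
```

The DATA step also writes a note to the log saying that the MERGE statement has more than one data set with repeats of BY values, which is a useful signal that the two approaches will disagree.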