Data Step/PROC SQL comparison

Trusted Advisor
Posts: 1,607

If I merge two data sets using a SAS DATA step, SAS allows me to give the merged data set the same name as one of the original data sets. I have never had a problem doing this.

Example:

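data a;
     merge a b;   /* match-merge a and b; the output replaces a */
     by id;       /* both inputs must be sorted (or indexed) by id */
run;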

Note, a and b are merged to form a new dataset a.

If I try doing this in PROC SQL, for example:

proc sql;
     create table a as
     select a.*, b.height
     from a left join b
     on a.id = b.id;
quit;

I get a warning in the log, and I don't really understand the warning, or what risks I am taking by doing things this way. The warning in the LOG says:

WARNING: This CREATE TABLE statement recursively references the target table. A consequence of this is a possible data integrity problem.

So, what are the risks of doing this via PROC SQL? Or should I just adopt a policy of not doing this in PROC SQL?


Accepted Solutions
Solution
‎05-23-2013 09:31 AM
Trusted Advisor
Posts: 2,113

Re: Data Step/PROC SQL comparison

Paige,

You get the message because PROC SQL may try to do the merge on the back-end server. If PROC SQL is running against SQL Server or Oracle, for example, it will try to generate SQL statements native to that server and pass them over for more efficient processing. Because the statement reads from and writes to the same table, you then run the risk of the server corrupting the data.

If all the data are SAS data sets, the message is not very useful: both the DATA step and PROC SQL write the merged data to a new temporary result data set and then rename it, so the only risk of losing data is during the instant when the files are being renamed.
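
If you would rather not see the warning at all, one option is to write the join to a new table and then swap the names yourself. This is only a minimal sketch, assuming both tables live in the WORK library; the name a_joined is made up for illustration.

```sas
/* Sketch: join into a new table so the query never reads and   */
/* writes the same table, then replace the original by renaming. */
/* a_joined is a hypothetical name chosen for this example.      */
proc sql;
     create table work.a_joined as
     select a.*, b.height
     from work.a as a left join work.b as b
     on a.id = b.id;
quit;

proc datasets library=work nolist;
     delete a;             /* drop the original table            */
     change a_joined=a;    /* rename the joined table to a       */
quit;
```

The original table is gone between the DELETE and the CHANGE, so this does not remove the brief window of risk; it only makes it explicit and avoids the recursive reference.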

Doc Muhlbaier

Duke

Trusted Advisor
Posts: 1,607

Re: Data Step/PROC SQL comparison

Okay, thanks Doc, that makes sense and it eliminates one of my worries.

N/A
Posts: 1

Re: Data Step/PROC SQL comparison

Paige,

As Doc@Duke mentioned, the message is not very useful when everything is a SAS data set: both the DATA step and PROC SQL create a new temporary result data set with the merged data, so you could lose data only during the instant when the files are being renamed.

One other thing to note, somewhat related, is that PROC SQL and a DATA step merge produce different results when the two data sets/tables have a many-to-many relationship (i.e., many records from one data set/table match many records from the other). That's because PROC SQL creates what's called a Cartesian product: every combination of matching records between the two data sets/tables. A DATA step merge matches records sequentially, one record from one data set to one from the other, without looking back (once it matches, it moves on to the next record).
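
The many-to-many difference is easy to see with a tiny made-up example. The data set names (one, two, merged, joined) and the values below are invented for illustration; an inner join is used so that only the row counts differ.

```sas
/* Two rows for id=1 in one, three rows for id=1 in two. */
data one;
     input id x $;
     datalines;
1 a1
1 a2
;

data two;
     input id y $;
     datalines;
1 b1
1 b2
1 b3
;

/* DATA step merge: pairs rows in sequence within the BY group. */
/* Result has 3 rows: (a1,b1), (a2,b2), (a2,b3).                */
data merged;
     merge one two;
     by id;
run;

/* PROC SQL join: every matching combination.  */
/* Result has 2 x 3 = 6 rows.                  */
proc sql;
     create table joined as
     select o.id, o.x, t.y
     from one as o inner join two as t
     on o.id = t.id;
quit;
```

In the DATA step result, the last value of x is carried forward once data set one runs out of rows for that id, and the log carries a note that more than one data set has repeats of BY values.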

Hopefully that makes sense.
