BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LEINAARE
Obsidian | Level 7

Hello,

 

I am working with two very large datasets and would like to use PROC SQL to merge a variable (AdmDiag10) from a dataset (new.AdmDiag10_merge) onto another dataset to avoid unnecessary sort procedures.  The new.AdmDiag10_merge dataset has only five variables, the four need to merge the records and the new variable (AdmDiag10).  Since I am more comfortable using data steps, I am including below the program I would use (if I wanted to use a data step) to help communicate what I am trying to achieve with a left join.

 

/*Data step I would use to achieve desired merge*/
data medical_1217;
	merge medical_1217 (in=a)
		new.admdiag10_merge;
	by orsid recordid log file_type;
	if a = 1;
run;


/*SQL Procedure I have developed to attempt a similar join*/
proc sql;
	create table test as
	select a.*, b.*
	from medical_1217 a
	left join new.admdiag10_merge b
	on a.orsid = b.orsid
	and a.recordid = b.recordid
	and a.log = b.log
	and a.file_type = b.file_type;
quit;

 

When I run the PROC SQL program, I get warning messages for each variable used to join on.  Can someone please explain what is going on here?

 

Thanks,

 

Ted

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Your warning is coming from this bit of code:

proc sql;
	create table test as
	select a.*, b.*

The * requests ALL the variables. So since both alias A and B have at least 4 variables in common you have requested duplicates for the variables. The JOIN has nothing to do with the warning. Observe:

1730  proc sql ;
1731     create table work.junk as
1732     select name , name
1733     from sashelp.class
1734     ;
WARNING: Variable Name already exists on file WORK.JUNK.
NOTE: Table WORK.JUNK created, with 19 rows and 1 columns.

It is the mention of the same variable name.

If you do not want the warning then only list the variables from B that you want to add to A, or vice versa. Just mention a variable name once.

 

Unfortunately data steps don't do well with multiple records with the same BY variables in both sets when using Merge and won't allow multiple records in the base set if using UPDATE. So SQL is often the appropriate tool.

Of course SQL can be extremely verbose due to the syntax requirements.

 

View solution in original post

5 REPLIES 5
ballardw
Super User

Your warning is coming from this bit of code:

proc sql;
	create table test as
	select a.*, b.*

The * requests ALL the variables. So since both alias A and B have at least 4 variables in common you have requested duplicates for the variables. The JOIN has nothing to do with the warning. Observe:

1730  proc sql ;
1731     create table work.junk as
1732     select name , name
1733     from sashelp.class
1734     ;
WARNING: Variable Name already exists on file WORK.JUNK.
NOTE: Table WORK.JUNK created, with 19 rows and 1 columns.

It is the mention of the same variable name.

If you do not want the warning then only list the variables from B that you want to add to A, or vice versa. Just mention a variable name once.

 

Unfortunately data steps don't do well with multiple records with the same BY variables in both sets when using Merge and won't allow multiple records in the base set if using UPDATE. So SQL is often the appropriate tool.

Of course SQL can be extremely verbose due to the syntax requirements.

 

LEINAARE
Obsidian | Level 7
Hi @ballardw

Thanks for your response. I am not very savvy with proc SQL. Could you please explain what you mean by "If you do not want the warning then only list the variables from B that you want to add to A, or vice versa. Just mention a variable name once."
LEINAARE
Obsidian | Level 7
Does that mean that I do not need to use a SELECT statement for alias A?
ballardw
Super User

@LEINAARE wrote:
Does that mean that I do not need to use a SELECT statement for alias A?

No.

As a minimum you likely want a select like:

 

select a.*, b.othervar, b.somethingelse, b.thatvar

 

which explicitly lists the variables from B that you want that are not already in A

 

You will find SQL users that never use the * for selection because it indicates that you don't really know what you really want or need.

kalyan1234
Calcite | Level 5

select variables in right dataset to avoid notes in log
PROC SQL;
Create table exxb as
Select a.* ,b.exstdtc,extpt,medgroup as medex from xba4 as a left join exxx as b
On a.usubjid = b.usubjid and a.visit=b.visit;
Quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 25647 views
  • 5 likes
  • 3 in conversation