Hello,
I am joining 2 tables and this is my syntax:
proc sql;
create table final1 as select a.*, b.* from main1 a left join main2 b
on a.ACCOUNT_ID = b.ACCOUNT_ID
; quit;
And I know why I get the following note: WARNING: Variable ACCOUNT_ID already exists on file WORK.FINAL1
Both a and b data sets have many columns and I don't want to list out ALL the columns of b and just eliminate ACCOUNT_ID.
I tried using DROP at various places but did not work.
What is the method with proc sql if you want to keep a lot more columns than you want to drop, when joining tables. (In this ex I want to drop only 1 but my query is in general)
Thanks!
It can be done by using a little rename/drop subterfuge:
proc sql ;
create table final1 (drop=account_id2) as
select *
from main1 left join main2 (rename=account_id=account_id2)
on account_id = account_id2
;
quit ;
Another advantage of this trick is that you don't even have to qualify the tables being joined - provided, of course, that table MAIN2 has no column named ACCOUNT_ID2. To be even more on the safe side, you can use something wild like ___AID instead of ACCOUNT_ID2 (use as many leading underscores as need be to ensure the column doesn't exist in the tables being joined).
AFAIK, with FedSQL in Viya, you can use the USING clause to achieve the effect without jumping through hoops like above:
proc sql ;
create table final1 (drop=account_id2) as
select *
from main1 left join main2
using (account_id)
;
quit ;
Documented at:
In my 9.4 it doesn't work, though, so I use the rename/drop recipe to get what I want.
Kind regards
Paul D.
Since account_id is the only field in common, you can also do:
proc sql ;
create table final1 as
select *
from main1 natural left join main2;
;
quit ;
Egad! You've just dispelled my confusion that "natural" is applicable to inner joins only. Not that I haven't tried, but in my utter stupidity, I coded "left natural" rather than "natural left", having neglected to try it in the opposite order ... duh.
That said, compared to the rename/drop or using the USING clause, a natural join creates a program-control limitation by auto-joining on all the columns with the same names and data types. Methinks it would be best if the USING clause in this sense were made applicable to all SAS SQL flavors. But I'm afraid there exists a serious obstacle, since though in Base the clause does exist, it is already used for a different purpose (related to working with external data bases).
Kind regards
Paul D.
One other option, but not as succinct as the others one here.
Use the FEEDBACK option on the PROC SQL and check the log. It will have the fully generated code.
data bmi;
set sashelp.class;
bmi = weight*703 / (height**2) ;
keep name bmi;
run;
proc sql FEEDBACK;
create table want as
select *
from sashelp.class as a
left join bmi as b
on a.name=b.name;
quit;
The FEEDBACK option puts this into the log:
78 proc sql FEEDBACK; 79 create table want as 80 select * 81 from sashelp.class as a 82 left join bmi as b 83 on a.name=b.name; NOTE: Statement transforms to: select A.Name, A.Sex, A.Age, A.Height, A.Weight, B.Name, B.bmi from SASHELP.CLASS A left outer join WORK.BMI B on A.Name = B.Name; WARNING: Variable Name already exists on file WORK.WANT. NOTE: Table WORK.WANT created, with 19 rows and 6 columns.
You can easily filter the columns now.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.