Hey!
I've got a problem, which I barely able to solve. 😕
I have to make a table to compare datas (10 categories). I made two independent tables, which both have "IDs" (as "Kategoria"). I want to merge them (or left join them, with proc sql) by the given ID, and to keep all of the columns, to compare both.
For example there is a "A" table with a,b,c,d columns and there is an another "B" table with same columns, but different datas. I want to merge them by 'd' in the "C" - new table - so it should look like this: a,b,c, D (this is the key),a,b,c.
But as I use left join it says error "WARNING: Variable Origin already exists on file TEST.CARS_COMPARED" and 3 more errors like this.
How should I join them? Any ideas? 🙂
Here is the full code:
proc sql; create table test.cars as select * from sashelp.cars order by invoice; quit; data _null_; length maxinv 8; set test.cars end=e; retain maxinv; if _N_=1 then maxinv=invoice; if maxinv<invoice then maxinv=invoice; if e then put maxinv=; length mininv 8; set test.cars end=e; retain mininv; if _N_=1 then mininv=invoice; if mininv>invoice then mininv=invoce; if e then put mininv=; diffinv=maxinv-mininv; put diffinv; dis=diffinv/10; put dis; run; Data test.category; set test.cars; if invoice =< 9875 then category=1; if 9875 < invoice < 26243.5 then category=2; if 26243.5 < invoice < 42612 then category=3; if 42612 < invoice < 58980.5 then category=4; if 58980.5 < invoice < 75349 then category=5; if 75349 < invoice < 91717.5 then category=6; if 91717.5 < invoice < 108086 then category=7; if 108086 < invoice < 124454.5 then category=8; if 124454.5 < invoice < 140823 then category=9; if 140823 < invoice then category=10; run; proc sql; create table test.cars_EU as select Origin, MPG_City, Horsepower as Loero, Category as Kategoria from test.category group by kategoria, Origin having Origin = 'Europe' order by Origin, kategoria, Loero; quit; proc sql; create table test.cars_USA as select Origin, MPG_City, Horsepower as Loero, Category as Kategoria from test.category group by kategoria, Origin having Origin = 'USA' order by Origin, Kategoria, Loero; quit; /*HERE I WANT TO MERGE:*/ proc sql; create table test.cars_compared as select a.*, b.* from test.cars_eu as a left join test.cars_usa as b on a.kategoria = b.kategoria; quit;
ERRORS:
WARNING: Variable Origin already exists on file TEST.CARS_COMPARED.
WARNING: Variable MPG_City already exists on file TEST.CARS_COMPARED.
WARNING: Variable Loero already exists on file TEST.CARS_COMPARED.
WARNING: Variable Kategoria already exists on file TEST.CARS_COMPARED.
Thank you for your help! 🙂
Don't use *, at least for the second dataset, and give the variables new names:
proc sql;
create table test.cars_compared as
select
a.*,
b.origin as b_origin,
b.mpg_city as b_mpg_city,
b.loero as b_loero
from test.cars_eu as a left join test.cars_usa as b
on a.kategoria = b.kategoria;
quit;
The names of columns in a dataset have to be unique.
b.kategoria is omitted as it is identical to a.kategoria by definition.
Don't use *, at least for the second dataset, and give the variables new names:
proc sql;
create table test.cars_compared as
select
a.*,
b.origin as b_origin,
b.mpg_city as b_mpg_city,
b.loero as b_loero
from test.cars_eu as a left join test.cars_usa as b
on a.kategoria = b.kategoria;
quit;
The names of columns in a dataset have to be unique.
b.kategoria is omitted as it is identical to a.kategoria by definition.
And if you need to compare datasets, have a look at proc compare; there you can (and should) have identically named columns in the input datasets.
Thank you, helped a lot! 🙂
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.