Hey!
I've got a problem, which I barely able to solve. 😕
I have to make a table to compare datas (10 categories). I made two independent tables, which both have "IDs" (as "Kategoria"). I want to merge them (or left join them, with proc sql) by the given ID, and to keep all of the columns, to compare both.
For example there is a "A" table with a,b,c,d columns and there is an another "B" table with same columns, but different datas. I want to merge them by 'd' in the "C" - new table - so it should look like this: a,b,c, D (this is the key),a,b,c.
But as I use left join it says error "WARNING: Variable Origin already exists on file TEST.CARS_COMPARED" and 3 more errors like this.
How should I join them? Any ideas? 🙂
Here is the full code:
proc sql; create table test.cars as select * from sashelp.cars order by invoice; quit; data _null_; length maxinv 8; set test.cars end=e; retain maxinv; if _N_=1 then maxinv=invoice; if maxinv<invoice then maxinv=invoice; if e then put maxinv=; length mininv 8; set test.cars end=e; retain mininv; if _N_=1 then mininv=invoice; if mininv>invoice then mininv=invoce; if e then put mininv=; diffinv=maxinv-mininv; put diffinv; dis=diffinv/10; put dis; run; Data test.category; set test.cars; if invoice =< 9875 then category=1; if 9875 < invoice < 26243.5 then category=2; if 26243.5 < invoice < 42612 then category=3; if 42612 < invoice < 58980.5 then category=4; if 58980.5 < invoice < 75349 then category=5; if 75349 < invoice < 91717.5 then category=6; if 91717.5 < invoice < 108086 then category=7; if 108086 < invoice < 124454.5 then category=8; if 124454.5 < invoice < 140823 then category=9; if 140823 < invoice then category=10; run; proc sql; create table test.cars_EU as select Origin, MPG_City, Horsepower as Loero, Category as Kategoria from test.category group by kategoria, Origin having Origin = 'Europe' order by Origin, kategoria, Loero; quit; proc sql; create table test.cars_USA as select Origin, MPG_City, Horsepower as Loero, Category as Kategoria from test.category group by kategoria, Origin having Origin = 'USA' order by Origin, Kategoria, Loero; quit; /*HERE I WANT TO MERGE:*/ proc sql; create table test.cars_compared as select a.*, b.* from test.cars_eu as a left join test.cars_usa as b on a.kategoria = b.kategoria; quit;
ERRORS:
WARNING: Variable Origin already exists on file TEST.CARS_COMPARED.
WARNING: Variable MPG_City already exists on file TEST.CARS_COMPARED.
WARNING: Variable Loero already exists on file TEST.CARS_COMPARED.
WARNING: Variable Kategoria already exists on file TEST.CARS_COMPARED.
Thank you for your help! 🙂
Don't use *, at least for the second dataset, and give the variables new names:
proc sql;
create table test.cars_compared as
select
a.*,
b.origin as b_origin,
b.mpg_city as b_mpg_city,
b.loero as b_loero
from test.cars_eu as a left join test.cars_usa as b
on a.kategoria = b.kategoria;
quit;
The names of columns in a dataset have to be unique.
b.kategoria is omitted as it is identical to a.kategoria by definition.
Don't use *, at least for the second dataset, and give the variables new names:
proc sql;
create table test.cars_compared as
select
a.*,
b.origin as b_origin,
b.mpg_city as b_mpg_city,
b.loero as b_loero
from test.cars_eu as a left join test.cars_usa as b
on a.kategoria = b.kategoria;
quit;
The names of columns in a dataset have to be unique.
b.kategoria is omitted as it is identical to a.kategoria by definition.
And if you need to compare datasets, have a look at proc compare; there you can (and should) have identically named columns in the input datasets.
Thank you, helped a lot! 🙂
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.