Hello Everyone,
I have a lookup dataset (X) with 2 variables i.e country and unique ID for every country. I have lot of observations in the table but I pasted only few of them.
Country UID Recode
US 100 USTATES
MEXICO 111 MEX
ARGENTINA 123 ARGEN
JAPAN 145 JAP
INDIA 160 IND
RUSSIA 125 RUSS
.
.
.
.
I have a Primary dataset (Y) with these variables
Drug Country UID
A US 100
K INDIA 160
L RUSSIA 125
C MEXICO 111
D ARGENTINA 123
M JAPAN 145
.
.
.
.
Now, by using the lookup table and UID, I need to merge the variable RECODE from look up table (X) to the primary dataset (Y). The output dataset should look like below:
Drug Country UID Recode
A US 100 USTATES
K INDIA 160 IND
L RUSSIA 125 RUSS
C MEXICO 111 MEX
D ARGENTINA 123 ARGEN
M JAPAN 145 JAP
.
.
.
.
.
I need help with SAS code on how to do this on larger dataset with 50,000 observations.
Thank you in Advance
Danwa
proc sql;
create table want as
select a.*, b.Recode from Y a
left join X b
on a.UID=b.UID;
quit;
proc sql;
create table want as
select a.*, b.Recode from Y a
left join X b
on a.UID=b.UID;
quit;
'a' and 'b' are table aliases. These are just to refer table names. 'a' is refering table Y and 'b' is refering table X.
Hello Stat_sas,
Thanks for the solution.
If I wanted to join two or more variables to the table 'Y' using UID as reference, then how would we code it. Guide me.
For example,
For dataset 'X' the variables are,
Country UID Recode Population Region
US 100 USTATES 12345 Texas
MEXICO 111 MEX 14567 International
ARGENTINA 123 ARGEN 15678 International
JAPAN 145 JAP 12345 Asia
INDIA 160 IND 14569 Asia
RUSSIA 125 RUSS 23567 Asia
For dataset "Y" the variables are
Drug Country UID
A US 100
K INDIA 160
L RUSSIA 125
C MEXICO 111
D ARGENTINA 123
M JAPAN 145
My output dataset should look like this using UID as reference:
Drug Country UID Recode Population Region
A US 100 USTATES 12345 Texas
K MEXICO 111 MEX 14567 International
L ARGENTINA 123 ARGEN 15678 International
C JAPAN 145 JAP 12345 Asia
D INDIA 160 IND 14569 Asia
M RUSSIA 125 RUSS 23567 Asia
This can be achieved by adding two more fields from X data set. As 'b' is refering table X, so if you need to add more fields from table X just add table reference (b) along with field name. Some thing like this should work
proc sql;
create table want as
select a.*, b.Recode, b.Population, b.Region from Y a
left join X b
on a.UID=b.UID;
quit;
@danwarags wrote:
Thanks for the solution stat_sas. Could you tell me what does 'a' and 'b' referred to in this code
To expand slightly on @stat_sas's response.
The code posted used a "short hand" version of referencing the aliases. A more formal code line uses AS to create that reference.
A slightly different code appearance for the same thing:
proc sql;
create table want as
select a.*, b.Recode
from Y as a left join
X as b
on a.UID=b.UID;
quit;
The aliases make more sense when the tables have names like CountryLookUpTable and CountryDrugReferenceData. Otherwise you get to type those table names a lot with some operations. Typing A. reduces errors from typing CountryLookUpTable multiple times and makes the code a bit easier to read and understand (with practice).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.