Hi All,
I have two data sets that I want to merge based on the values of one variable in one of the data sets. Below is the Inspector table I have and I want to merge it with the building table.
Inspector Table
Building # Inspector # Inspection_Day_I
22 1 1
22 2 4
25 1 1
25 2 1
25 3 2
25 4 2
25 5 3
Building table
Building # Inspection_Day_B Var1 Var2
22 1 S 1
22 2 1 0
22 3 0 1
22 4 1 T
25 1 S 0
25 2 1 1
25 3 S 1
I want to keep only the observations from the Building table from the Days the inspector came. So my final table will look like this:
Want
Building # Inspector # Inspection_Day_I Inspection_Day_B Var1 Var2
22 1 1 1 S 1
22 2 4 4 1 T
25 1 1 1 S 0
25 2 1 1 S 0
25 3 2 2 1 1
25 4 2 2 1 1
25 5 3 3 S 1
All the observations from the Inspection table are used, but only the selected days from the Building table are in the want data set. Rows 2 and 3 from the building data set were not merged with the Inspector table.
How do I do this?
also try merge step
data Inspector_Table(rename=(Inspection_Day_I=Inspection_Day_B));
input Building Inspector Inspection_Day_I;
cards;
22 1 1
22 2 4
25 1 1
25 2 1
25 3 2
25 4 2
25 5 3
;
proc sort data=Inspector_Table;
by Building Inspection_Day_B;
run;
data Building_table;
input Building Inspection_Day_B Var1$ Var2$;
cards;
22 1 S 1
22 2 1 0
22 3 0 1
22 4 1 T
25 1 S 0
25 2 1 1
25 3 S 1
;
proc sort data=Building_table;
by Building Inspection_Day_B;
run;
data want;
merge Inspector_Table(in=a) Building_table(in=b);
by Building Inspection_Day_b;
if a;
run;
Thanks,
Jag
Here is a solution for you:
data inspection_table;
infile cards;
input Building Inspector Inspection_Day_I;
cards;
22 1 1
22 2 4
25 1 1
25 2 1
25 3 2
25 4 2
25 5 3
;
data building_table;
infile cards;
input Building Inspection_Day_B Var1$ Var2$;
cards;
22 1 S 1
22 2 1 0
22 3 0 1
22 4 1 T
25 1 S 0
25 2 1 1
25 3 S 1
;
proc sql;
create table want as
select a.building,a.inspector,a.Inspection_Day_I,b.Inspection_Day_B,b.var1,b.var2
from inspection_table a inner join
building_table b on
a.building = b.building and
a.inspection_day_i = b.inspection_day_b;
Hi! Nice solution with SQL.
But - SQL is generally speaking a Set oriented language, when SAS (with SET and MERGE) are a sequential table oriented language.
My point: If You take this solution and run the SQL in a Data Base language, like Oracle, DB2 or Sybase - will it then still work the way you want it to ?
/ Br Anders
Joining on both variables the way it is here will get you the same result in any environment. From my perspective the datastep and sql are doing the same, but with sql you don't have to rename or sort.
also try merge step
data Inspector_Table(rename=(Inspection_Day_I=Inspection_Day_B));
input Building Inspector Inspection_Day_I;
cards;
22 1 1
22 2 4
25 1 1
25 2 1
25 3 2
25 4 2
25 5 3
;
proc sort data=Inspector_Table;
by Building Inspection_Day_B;
run;
data Building_table;
input Building Inspection_Day_B Var1$ Var2$;
cards;
22 1 S 1
22 2 1 0
22 3 0 1
22 4 1 T
25 1 S 0
25 2 1 1
25 3 S 1
;
proc sort data=Building_table;
by Building Inspection_Day_B;
run;
data want;
merge Inspector_Table(in=a) Building_table(in=b);
by Building Inspection_Day_b;
if a;
run;
Thanks,
Jag
Thanks, Jag!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.