Table X
| gvkey | fyear | Name | 
| 1004 | 1994 | a | 
| 1004 | 1995 | a | 
| 1004 | 1996 | b | 
| 1004 | 1997 | b | 
Table Y
| gvkey | fyear | Name | 
| 1004 | 1992 | a | 
| 1004 | 1993 | a | 
| 1004 | 1996 | e | 
| 1004 | 1997 | f | 
I want to merge the above two tables in the following way
| gvkey | fyear | Name | 
| 1004 | 1992 | a | 
| 1004 | 1993 | a | 
| 1004 | 1994 | a | 
| 1004 | 1995 | a | 
| 1004 | 1996 | b | 
| 1004 | 1997 | b | 
You can see that I want every observation from table X but from table Y I want only those observations that matches GVKEY but don't match fyear. From table Y if gvkey and year matches, I don't want those observations to be included in my result. Can anyone help me with how I can do this by using proc SQL? I tried to do that using left join but I don't get the desired result.
data X; infile cards expandtabs truncover; input gvkey fyear Name $; cards; 1004 1994 a 1004 1995 a 1004 1996 b 1004 1997 b ; data Y; infile cards expandtabs truncover; input gvkey fyear Name $; cards; 1004 1992 a 1004 1993 a 1004 1996 e 1004 1997 f ; proc sql; create table want as select * from x union all select * from y where not exists(select * from x where gvkey=y.gvkey and fyear=y.fyear) order by 1,2; quit;
data X; infile cards expandtabs truncover; input gvkey fyear Name $; cards; 1004 1994 a 1004 1995 a 1004 1996 b 1004 1997 b ; data Y; infile cards expandtabs truncover; input gvkey fyear Name $; cards; 1004 1992 a 1004 1993 a 1004 1996 e 1004 1997 f ; proc sql; create table want as select * from x union all select * from y where not exists(select * from x where gvkey=y.gvkey and fyear=y.fyear) order by 1,2; quit;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
