Hello,
I have many many tables. And I would like to have them merged together. I would like to put OVERLAP in HAVE2000 in 2000's row in HAVE1. And put OVERLAP in HAVE2001 in 2001 row in HAVE1.
data have1 ;
input
COUNTRY1 YEAR COUNTRY2 $5.;
datalines;
001 2000 101
001 2000 002
001 2001 103
002 2000 201
002 2000 202
002 2001 203
003 2001 203
003 2000 301
003 2000 302
003 2001 303
;
run;
data have2000 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001 101 1
001 002 2
001 103 1
002 201 0
002 202 1
002 203 2
003 203 1
003 301 2
003 302 0
003 303 0
;
run;
data have2001 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001 101 2
001 002 1
001 103 0
002 201 1
002 202 2
002 203 1
003 203 1
003 301 1
003 302 2
003 303 0
;
run;
Want is like this
data want ;
input
COUNTRY1 YEAR COUNTRY2 OVERLAP $5.;
datalines;
001 2000 101 1
001 2000 002 2
001 2001 103 0
002 2000 201 0
002 2000 202 1
002 2001 203 1
003 2001 203 1
003 2000 301 2
003 2000 302 0
003 2001 303 0
;
run;
Shall I use left join or something?
Thank you so much.
If you prefer SQL:
proc sql;
select have1.*, overlap
from have1 inner join have2000 on
have1.year=2000 and
have1.country1=have2000.country1 and
have1.country2=have2000.country2
union all
select have1.*, overlap
from have1 inner join have2001 on
have1.year=2001 and
have1.country1=have2001.country1 and
have1.country2=have2001.country2;
quit;
data have1 ;
input
COUNTRY1 YEAR COUNTRY2 ;
datalines;
001 2000 101
001 2000 002
001 2001 103
002 2000 201
002 2000 202
002 2001 203
003 2001 203
003 2000 301
003 2000 302
003 2001 303
;
run;
data have2000 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001 101 1
001 002 2
001 103 1
002 201 0
002 202 1
002 203 2
003 203 1
003 301 2
003 302 0
003 303 0
;
run;
data have2001 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001 101 2
001 002 1
001 103 0
002 201 1
002 202 2
002 203 1
003 203 1
003 301 1
003 302 2
003 303 0
;
run;
data want;
if 0 then set have1;
dcl hash H (multidata:'y') ;
h.definekey ('year','COUNTRY1','COUNTRY2') ;
h.definedata ('overlap') ;
h.definedone();
do until(lr);
set have2000 have2001 indsname=name end=lr;
year=input(compress(name,,'kd'),8.);
rc=h.add();
end;
lr=0;
do until(lr);
set have1;
rc=h.find();
output;
end;
stop;
drop rc;
run;
Thank you very much for the answer!
If you prefer SQL:
proc sql;
select have1.*, overlap
from have1 inner join have2000 on
have1.year=2000 and
have1.country1=have2000.country1 and
have1.country2=have2000.country2
union all
select have1.*, overlap
from have1 inner join have2001 on
have1.year=2001 and
have1.country1=have2001.country1 and
have1.country2=have2001.country2;
quit;
Thank you.
SQL is always my favorite. So far I just add a YEAR variable to and append all have&year. tables and match by year.
SQL is also working. But since I have 20 tables have&years., should I add UNION ALL 20 times in SQL?
Thank you. If there is a convenient way to repeat UNION ALL steps by year, it will be very greate.
It would be a lot simpler to concatenate your yearly datasets (while adding a YEAR variable) before joining with your main (have1) dataset on YEAR, COUNTRY1, and COUNTRY2.
Concatenate with something like:
data haveAll;
set have20: INDSNAME=dsn;
/* get year from dataset name, such as WORK.HAVE2010 */
year = input(substr(scan(dsn, 2), 5), best.);
run;
Thank you very much.
I actually did that. Here is something I use so far. It has been a while since last using....but it seems work out fine.
data haveALL ;
set have2000-have2017 indsname=dsname ;
length YEAR $4. ;
year = substr(dsname,length(dsname)-3);
run;
Great! But year is a number in have1 and created as a character variable in haveALL. You can make them more compatible with the input function.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.