BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yanshuai
Quartz | Level 8

Hello,

 

I have many many tables. And I would like to have them merged together. I would like to put OVERLAP in HAVE2000 in 2000's row in HAVE1. And put OVERLAP in HAVE2001 in 2001 row in HAVE1.

data have1 ;
input
COUNTRY1 YEAR COUNTRY2 $5.;
datalines;
001  2000  101
001  2000  002
001  2001  103
002  2000  201
002  2000  202
002  2001  203
003  2001  203
003  2000  301
003  2000  302
003  2001  303
;
run;

data have2000 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001  101  1
001  002  2
001  103  1
002  201  0
002  202  1
002  203  2
003  203  1
003  301  2
003  302  0
003  303  0
;
run;

data have2001 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001  101  2
001  002  1
001  103  0
002  201  1
002  202  2
002  203  1
003  203  1
003  301  1
003  302  2
003  303  0
;
run;

Want is like this

data want ;
input
COUNTRY1 YEAR COUNTRY2 OVERLAP $5.;
datalines;
001  2000  101  1
001  2000  002  2
001  2001  103  0
002  2000  201  0
002  2000  202  1
002  2001  203  1
003  2001  203  1
003  2000  301  2
003  2000  302  0
003  2001  303  0
;
run;

Shall I use left join or something?

 

Thank you so much.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

If you prefer SQL:

 

proc sql;
select have1.*, overlap 
from have1 inner join have2000 on
    have1.year=2000 and 
    have1.country1=have2000.country1 and 
    have1.country2=have2000.country2
union all
select have1.*, overlap  
from have1 inner join have2001 on
    have1.year=2001 and 
    have1.country1=have2001.country1 and 
    have1.country2=have2001.country2;
quit;
PG

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20
data have1 ;
input
COUNTRY1 YEAR COUNTRY2 ;
datalines;
001  2000  101
001  2000  002
001  2001  103
002  2000  201
002  2000  202
002  2001  203
003  2001  203
003  2000  301
003  2000  302
003  2001  303
;
run;

data have2000 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001  101  1
001  002  2
001  103  1
002  201  0
002  202  1
002  203  2
003  203  1
003  301  2
003  302  0
003  303  0
;
run;

data have2001 ;
input
COUNTRY1 COUNTRY2 OVERLAP $5.;
datalines;
001  101  2
001  002  1
001  103  0
002  201  1
002  202  2
002  203  1
003  203  1
003  301  1
003  302  2
003  303  0
;
run;

data want;
if 0 then set have1;
dcl hash H (multidata:'y') ;
   h.definekey  ('year','COUNTRY1','COUNTRY2') ;
   h.definedata ('overlap') ;
   h.definedone();
do until(lr);
set have2000 have2001 indsname=name end=lr;
year=input(compress(name,,'kd'),8.);
rc=h.add();
end;
lr=0;
do until(lr);
set have1;
rc=h.find();
output;
end;
stop;
drop rc;
run;
yanshuai
Quartz | Level 8

Thank you very much for the answer!

PGStats
Opal | Level 21

If you prefer SQL:

 

proc sql;
select have1.*, overlap 
from have1 inner join have2000 on
    have1.year=2000 and 
    have1.country1=have2000.country1 and 
    have1.country2=have2000.country2
union all
select have1.*, overlap  
from have1 inner join have2001 on
    have1.year=2001 and 
    have1.country1=have2001.country1 and 
    have1.country2=have2001.country2;
quit;
PG
yanshuai
Quartz | Level 8

Thank you.

SQL is always my favorite. So far I just add a YEAR variable to and append all have&year. tables and match by year.

SQL is also working. But since I have 20 tables have&years., should I add UNION ALL 20 times in SQL?

 

Thank you. If there is a convenient way to repeat UNION ALL steps by year, it will be very greate.

PGStats
Opal | Level 21

It would be a lot simpler to concatenate your yearly datasets (while adding a YEAR variable) before joining with your main (have1) dataset on YEAR, COUNTRY1, and COUNTRY2.

 

Concatenate with something like:

 

data haveAll;

set have20: INDSNAME=dsn;

/* get year from dataset name, such as WORK.HAVE2010 */

year = input(substr(scan(dsn, 2), 5), best.);

run;

 

 

PG
yanshuai
Quartz | Level 8

Thank you very much.

I actually did that. Here is something I use so far. It has been a while since last using....but it seems work out fine.

data haveALL ;
   set have2000-have2017 indsname=dsname ;
   length YEAR $4. ;
   year = substr(dsname,length(dsname)-3);
run;
PGStats
Opal | Level 21

Great! But year is a number in have1 and created as a character variable in haveALL. You can make them more compatible with the input function.

PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1277 views
  • 4 likes
  • 3 in conversation