Obsidian | Level 7

## How to merge datasets one pairwise and one not?

Hi all,

I am very new to SAS, and I am trying to merge two datasets.  The first has pairwise comparisons between the proportion shared between 2 IDs.  The second has additional identifying information about each of these IDs.  I want the merge the two datasets to produce an output similar to what is shown below, but I am unsure where to start.  I have provided samples of the data tables below.  I am working in SAS 9.4.  I appreciate any help you can provide.

 Input 1 ID1 ID2 Pair_Num Proportion_Shared 1 2 1 0.97 1 3 2 81 1 4 3 0.73 1 5 4 0.81 2 3 5 0.56 2 4 6 0.71 2 5 7 0.31 3 4 8 0.81 3 5 9 0.66 4 5 10 0.59

 Input 2 ID Color Age 1 Blue 15 2 Green 21 3 Blue 78 4 Green 23 5 Blue 56

 Desired Output ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference 1 2 1 0.97 Blue, Green 6 1 3 2 81 Blue, Blue 63 1 4 3 0.73 Blue, Green 8 1 5 4 0.81 Blue, Blue 41 2 3 5 0.56 Green, Blue 57 2 4 6 0.71 Green, Green 2 2 5 7 0.31 Green, Blue 35 3 4 8 0.81 Blue, Green 55 3 5 9 0.66 Blue, Blue 22 4 5 10 0.59 Green, Blue 33
1 ACCEPTED SOLUTION

Accepted Solutions

## Re: How to merge datasets one pairwise and one not?

Another way:

``````data want(keep= ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference);
set have1;

length Colors \$ 50;

if _n_ = 1 then do;
if 0 then set have2;
declare hash h(dataset: 'have2');
h.defineKey('ID');
h.defineData('Color', 'Age');
h.defineDone();
call missing(Color, Age);
end;

rc = h.find(key: ID1);
Colors = Color;
_Age = Age;

rc = h.find(key: ID2);
Colors = catx(', ', Colors, Color);
Age_Difference = abs(_Age - Age);
run;``````
3 REPLIES 3
Super User

## Re: How to merge datasets one pairwise and one not?

So you have a PAIRS dataset with these variables:

 ID1 ID2 Pair_Num Proportion_Shared

And and IDS dataset with these variables:

 ID Color Age

And you want to make new dataset with these variables.

 ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference

So a simple SQL join should do the job.

``````proc sql;
create table WANT as
select a.*
, catx(',',b.color,c.color) as Colors
, b.age - c.age as Age_Difference
from PAIRS A
left join IDS B on a.id1=b.id
left join IDS C on a.id2=c.id
;
quit;
``````

## Re: How to merge datasets one pairwise and one not?

Another way:

``````data want(keep= ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference);
set have1;

length Colors \$ 50;

if _n_ = 1 then do;
if 0 then set have2;
declare hash h(dataset: 'have2');
h.defineKey('ID');
h.defineData('Color', 'Age');
h.defineDone();
call missing(Color, Age);
end;

rc = h.find(key: ID1);
Colors = Color;
_Age = Age;

rc = h.find(key: ID2);
Colors = catx(', ', Colors, Color);
Age_Difference = abs(_Age - Age);
run;``````
Super User

## Re: How to merge datasets one pairwise and one not?

And this is how we would have done it before the availability of hash objects:

``````data input1;
infile datalines dlm="09"x;
input ID1 ID2 Pair_Num Proportion_Shared;
datalines;
1	2	1	0.97
1	3	2	81
1	4	3	0.73
1	5	4	0.81
2	3	5	0.56
2	4	6	0.71
2	5	7	0.31
3	4	8	0.81
3	5	9	0.66
4	5	10	0.59
;

data input2;
infile datalines dlm="09"x;
input ID Color \$ Age;
datalines;
1	Blue	15
2	Green	21
3	Blue	78
4	Green	23
5	Blue	56
;

data cntlin;
fmtname = "colorfmt";
type = "N";
do until (done1);
set input2 end=done1;
start = ID;
label = color;
output;
end;
fmtname = "agefmt";
type = "I";
do until (done2);
set input2 end=done2;
start = ID;
label = left(put(age,best.));
output;
end;
keep fmtname type start label;
run;

proc format cntlin=cntlin;
run;

data want;
set input1;
colors = catx(",",put(id1,colorfmt.),put(id2,colorfmt.));
age_difference = abs(input(id1,agefmt.) - input(id2,agefmt.));
run;``````

Discussion stats
• 3 replies
• 561 views
• 1 like
• 4 in conversation