BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAS49
Obsidian | Level 7

Hi all,

 

I am very new to SAS, and I am trying to merge two datasets.  The first has pairwise comparisons between the proportion shared between 2 IDs.  The second has additional identifying information about each of these IDs.  I want the merge the two datasets to produce an output similar to what is shown below, but I am unsure where to start.  I have provided samples of the data tables below.  I am working in SAS 9.4.  I appreciate any help you can provide. 

 

Input 1   
ID1ID2Pair_NumProportion_Shared
1210.97
13281
1430.73
1540.81
2350.56
2460.71
2570.31
3480.81
3590.66
45100.59

 

Input 2  
IDColorAge
1Blue15
2Green21
3Blue78
4Green23
5Blue56

 

Desired Output     
ID1ID2Pair_NumProportion_SharedColorsAge_Difference 
1210.97Blue, Green6
13281Blue, Blue63
1430.73Blue, Green8
1540.81Blue, Blue41
2350.56Green, Blue57
2460.71Green, Green2
2570.31Green, Blue35
3480.81Blue, Green55
3590.66Blue, Blue22
45100.59Green, Blue33
1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Another way:

data want(keep= ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference);
   set have1;
   
   length Colors $ 50;   
   
   if _n_ = 1 then do;
      if 0 then set have2;
      declare hash h(dataset: 'have2');
      h.defineKey('ID');
      h.defineData('Color', 'Age');
      h.defineDone();
      call missing(Color, Age);
   end;
   
   rc = h.find(key: ID1);
   Colors = Color;
   _Age = Age;

   rc = h.find(key: ID2);
   Colors = catx(', ', Colors, Color);
   Age_Difference = abs(_Age - Age);
run;

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

So you have a PAIRS dataset with these variables:

ID1 ID2 Pair_Num Proportion_Shared

 

And and IDS dataset with these variables:

ID Color Age

 

And you want to make new dataset with these variables.

ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference

 

So a simple SQL join should do the job.

proc sql;
  create table WANT as
  select a.*
       , catx(',',b.color,c.color) as Colors
       , b.age - c.age as Age_Difference
  from PAIRS A
  left join IDS B on a.id1=b.id
  left join IDS C on a.id2=c.id
  ;
quit;
andreas_lds
Jade | Level 19

Another way:

data want(keep= ID1 ID2 Pair_Num Proportion_Shared Colors Age_Difference);
   set have1;
   
   length Colors $ 50;   
   
   if _n_ = 1 then do;
      if 0 then set have2;
      declare hash h(dataset: 'have2');
      h.defineKey('ID');
      h.defineData('Color', 'Age');
      h.defineDone();
      call missing(Color, Age);
   end;
   
   rc = h.find(key: ID1);
   Colors = Color;
   _Age = Age;

   rc = h.find(key: ID2);
   Colors = catx(', ', Colors, Color);
   Age_Difference = abs(_Age - Age);
run;
Kurt_Bremser
Super User

And this is how we would have done it before the availability of hash objects:

data input1;
infile datalines dlm="09"x;
input ID1 ID2 Pair_Num Proportion_Shared;
datalines;
1	2	1	0.97
1	3	2	81
1	4	3	0.73
1	5	4	0.81
2	3	5	0.56
2	4	6	0.71
2	5	7	0.31
3	4	8	0.81
3	5	9	0.66
4	5	10	0.59
;

data input2;
infile datalines dlm="09"x;
input ID Color $ Age;
datalines;
1	Blue	15
2	Green	21
3	Blue	78
4	Green	23
5	Blue	56
;

data cntlin;
fmtname = "colorfmt";
type = "N";
do until (done1);
  set input2 end=done1;
  start = ID;
  label = color;
  output;
end;
fmtname = "agefmt";
type = "I";
do until (done2);
  set input2 end=done2;
  start = ID;
  label = left(put(age,best.));
  output;
end;
keep fmtname type start label;
run;

proc format cntlin=cntlin;
run;

data want;
set input1;
colors = catx(",",put(id1,colorfmt.),put(id2,colorfmt.));
age_difference = abs(input(id1,agefmt.) - input(id2,agefmt.));
run;

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 539 views
  • 1 like
  • 4 in conversation