BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

What is the best statement SET or Merge to get the desired PE output. I need to join the tables by PID and retain only the records from table P. Also need to have a field called source in PE to identify if the record is in both or P.

Thank you

Table P     Table E  
PID SID1   PID SID2
1 123   1 124
2 24   2 24
3 43   4 33
5 26   5 23
      6 15
Table PE        
PID SID1 SID2    
1 123 124    
2 24 24    
3 43      
5 26 23    
1 ACCEPTED SOLUTION

Accepted Solutions
Stalk
Pyrite | Level 9
I did the exact same thing. But I am getting Cartesian product.
Suppose my Table P has 17K records and table E has 16500 records. My merged dataset has 20684 records, even after keeping the records from P.
I want my output to have 17K records with SID from both tables. How?

View solution in original post

4 REPLIES 4
ballardw
Super User

@Stalk wrote:

What is the best statement SET or Merge to get the desired PE output. I need to join the tables by PID and retain only the records from table P. Also need to have a field called source in PE to identify if the record is in both or P.

Thank you

Table P     Table E  
PID SID1   PID SID2
1 123   1 124
2 24   2 24
3 43   4 33
5 26   5 23
      6 15
Table PE        
PID SID1 SID2    
1 123 124    
2 24 24    
3 43      
5 26 23    

When you want to MATCH records from two (or more) data sets based on the value of a common variable then you want a Merge with a By;

data p;
  input Pid sid1;
datalines;
1 1
2 2
;

data e;
  input enrolid sid2;
datalines;
1  11
3  33
;

data merged;
  merge p (in=in1)
        e (in=in2)
  ;
  /* the first IF keeps records that match the id field
     in the first set
  */
  if in1; 
  If in1 and in2 then source='Both';
  else if in1 then source='P';
run;

The data set option IN= creates a temporary variable valued 1 or 0 if the data set contributes to the current record. SAS uses 1 for true and 0 for false so can be used in If as above.

Stalk
Pyrite | Level 9
I did the exact same thing. But I am getting Cartesian product.
Suppose my Table P has 17K records and table E has 16500 records. My merged dataset has 20684 records, even after keeping the records from P.
I want my output to have 17K records with SID from both tables. How?
Stalk
Pyrite | Level 9
Actually never mind. I know my mistake now. I merged by PID.
AndreaVianello
Obsidian | Level 7

what is the best - merge or set?

 NO, SQL Left join !

data p;
input Pid sid1;
datalines;
1 123
2 24
3 43
5 26
;

data e;
input pid sid2;
datalines;
1 124
2 24
4 33
5 23
6 15
;
run;

proc sql;
create table PE as
select p.pid, sid1, sid2
from p left join e
on p.pid=e.pid;
quit;

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1262 views
  • 1 like
  • 3 in conversation