BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

What is the best statement SET or Merge to get the desired PE output. I need to join the tables by PID and retain only the records from table P. Also need to have a field called source in PE to identify if the record is in both or P.

Thank you

Table P     Table E  
PID SID1   PID SID2
1 123   1 124
2 24   2 24
3 43   4 33
5 26   5 23
      6 15
Table PE        
PID SID1 SID2    
1 123 124    
2 24 24    
3 43      
5 26 23    
1 ACCEPTED SOLUTION

Accepted Solutions
Stalk
Pyrite | Level 9
I did the exact same thing. But I am getting Cartesian product.
Suppose my Table P has 17K records and table E has 16500 records. My merged dataset has 20684 records, even after keeping the records from P.
I want my output to have 17K records with SID from both tables. How?

View solution in original post

4 REPLIES 4
ballardw
Super User

@Stalk wrote:

What is the best statement SET or Merge to get the desired PE output. I need to join the tables by PID and retain only the records from table P. Also need to have a field called source in PE to identify if the record is in both or P.

Thank you

Table P     Table E  
PID SID1   PID SID2
1 123   1 124
2 24   2 24
3 43   4 33
5 26   5 23
      6 15
Table PE        
PID SID1 SID2    
1 123 124    
2 24 24    
3 43      
5 26 23    

When you want to MATCH records from two (or more) data sets based on the value of a common variable then you want a Merge with a By;

data p;
  input Pid sid1;
datalines;
1 1
2 2
;

data e;
  input enrolid sid2;
datalines;
1  11
3  33
;

data merged;
  merge p (in=in1)
        e (in=in2)
  ;
  /* the first IF keeps records that match the id field
     in the first set
  */
  if in1; 
  If in1 and in2 then source='Both';
  else if in1 then source='P';
run;

The data set option IN= creates a temporary variable valued 1 or 0 if the data set contributes to the current record. SAS uses 1 for true and 0 for false so can be used in If as above.

Stalk
Pyrite | Level 9
I did the exact same thing. But I am getting Cartesian product.
Suppose my Table P has 17K records and table E has 16500 records. My merged dataset has 20684 records, even after keeping the records from P.
I want my output to have 17K records with SID from both tables. How?
Stalk
Pyrite | Level 9
Actually never mind. I know my mistake now. I merged by PID.
AndreaVianello
Obsidian | Level 7

what is the best - merge or set?

 NO, SQL Left join !

data p;
input Pid sid1;
datalines;
1 123
2 24
3 43
5 26
;

data e;
input pid sid2;
datalines;
1 124
2 24
4 33
5 23
6 15
;
run;

proc sql;
create table PE as
select p.pid, sid1, sid2
from p left join e
on p.pid=e.pid;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 659 views
  • 1 like
  • 3 in conversation