BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Stalk
Pyrite | Level 9

What is the best statement SET or Merge to get the desired PE output. I need to join the tables by PID and retain only the records from table P. Also need to have a field called source in PE to identify if the record is in both or P.

Thank you

Table P     Table E  
PID SID1   PID SID2
1 123   1 124
2 24   2 24
3 43   4 33
5 26   5 23
      6 15
Table PE        
PID SID1 SID2    
1 123 124    
2 24 24    
3 43      
5 26 23    
1 ACCEPTED SOLUTION

Accepted Solutions
Stalk
Pyrite | Level 9
I did the exact same thing. But I am getting Cartesian product.
Suppose my Table P has 17K records and table E has 16500 records. My merged dataset has 20684 records, even after keeping the records from P.
I want my output to have 17K records with SID from both tables. How?

View solution in original post

4 REPLIES 4
ballardw
Super User

@Stalk wrote:

What is the best statement SET or Merge to get the desired PE output. I need to join the tables by PID and retain only the records from table P. Also need to have a field called source in PE to identify if the record is in both or P.

Thank you

Table P     Table E  
PID SID1   PID SID2
1 123   1 124
2 24   2 24
3 43   4 33
5 26   5 23
      6 15
Table PE        
PID SID1 SID2    
1 123 124    
2 24 24    
3 43      
5 26 23    

When you want to MATCH records from two (or more) data sets based on the value of a common variable then you want a Merge with a By;

data p;
  input Pid sid1;
datalines;
1 1
2 2
;

data e;
  input enrolid sid2;
datalines;
1  11
3  33
;

data merged;
  merge p (in=in1)
        e (in=in2)
  ;
  /* the first IF keeps records that match the id field
     in the first set
  */
  if in1; 
  If in1 and in2 then source='Both';
  else if in1 then source='P';
run;

The data set option IN= creates a temporary variable valued 1 or 0 if the data set contributes to the current record. SAS uses 1 for true and 0 for false so can be used in If as above.

Stalk
Pyrite | Level 9
I did the exact same thing. But I am getting Cartesian product.
Suppose my Table P has 17K records and table E has 16500 records. My merged dataset has 20684 records, even after keeping the records from P.
I want my output to have 17K records with SID from both tables. How?
Stalk
Pyrite | Level 9
Actually never mind. I know my mistake now. I merged by PID.
AndreaVianello
Obsidian | Level 7

what is the best - merge or set?

 NO, SQL Left join !

data p;
input Pid sid1;
datalines;
1 123
2 24
3 43
5 26
;

data e;
input pid sid2;
datalines;
1 124
2 24
4 33
5 23
6 15
;
run;

proc sql;
create table PE as
select p.pid, sid1, sid2
from p left join e
on p.pid=e.pid;
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 455 views
  • 1 like
  • 3 in conversation