BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sayno2s
Fluorite | Level 6

Hi guys,

 

I want to join two tables but I'm having problems with the "lack of detail" in one variable which results in either too many or too few records in the output.

 

The issue is "VISIT" in the first data set. There are four "U" visits in "Prot1" and two in "Prot2". Joining the tables, I'm getting, e.g., six or two "U" visits in each protocol, depending on the code version (see below).

 

I've been trying things with in/excluding variables, different types of joins, with/out "distinct", "Group By", pulling out my hair, but nothing works.

I'd be very grateful for your help and thank you in advance!

This is how the two tables and the desired output look like:
Q-Tables.PNG

Here's the code example:

/* This is just an example for one "Num"; there are several hundreds in the tables */
data dset1; input Num VISIT $ Type $ Prot $ Result $; datalines; 100 M4 Type2 Prot1 Outcome1 100 M4 Type1 Prot1 Outcome2 100 W12 Type2 Prot2 Outcome5 100 W12 Type1 Prot2 Outcome5 100 W6 Type2 Prot2 Outcome6 100 W6 Type1 Prot2 Outcome5 100 U Type2 Prot1 Outcome3 100 U Type1 Prot1 Outcome3 100 U Type2 Prot1 Outcome3 100 U Type1 Prot1 Outcome3 100 U Type2 Prot2 Outcome5 100 U Type1 Prot2 Outcome4 ; run data dset2; input Num VisitName $ Type $ Reader $ OrigVisitName $ InternalID $; datalines; 100 M4 Type1 PersonA Month4 ID1 100 M4 Type2 PersonB Month4 ID1 100 W12 Type1 PersonA Week12 ID2 100 W12 Type2 PersonB Week12 ID2 100 W6 Type1 PersonA Week6 ID3 100 W6 Type2 PersonB Week6 ID3 100 U Type1 PersonA U1 ID4 100 U Type1 PersonA U2 ID6 100 U Type1 PersonA U3 ID5 100 U Type2 PersonB U1 ID4 100 U Type2 PersonB U2 ID6 100 U Type2 PersonB U3 ID5 ; run proc sql; create table work.WANT as select distinct t1.*, t2.VisitName, t2.Reader /*, t2.OrigVisitName, t2.InternalID */ from work.dset1 t1 full join work.dset2 t2 on (t1.Num = t2.Num) and (t1.VISIT = t2.VisitName) and (t1.Type = t2.Type); quit;
1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

Try next code. My result may be in different order then yours.

data dset1;
input Num VISIT $ Type $ Prot $ Result $;
datalines;
100 M4 Type2 Prot1 Outcome1
100 M4 Type1 Prot1 Outcome2
100 W12 Type2 Prot2 Outcome5
100 W12 Type1 Prot2 Outcome5
100 W6 Type2 Prot2 Outcome6
100 W6 Type1 Prot2 Outcome5
100 U Type2 Prot1 Outcome3
100 U Type1 Prot1 Outcome3
100 U Type2 Prot1 Outcome3
100 U Type1 Prot1 Outcome3
100 U Type2 Prot2 Outcome5
100 U Type1 Prot2 Outcome4
;
run;

data dset2;
input Num VisitName $ Type $ Reader $ OrigVisitName $ InternalID $;
datalines;
100 M4 Type1 PersonA Month4 ID1
100 M4 Type2 PersonB Month4 ID1
100 W12 Type1 PersonA Week12 ID2
100 W12 Type2 PersonB Week12 ID2
100 W6 Type1 PersonA Week6 ID3
100 W6 Type2 PersonB Week6 ID3
100 U Type1 PersonA U1 ID4
100 U Type1 PersonA U2 ID6
100 U Type1 PersonA U3 ID5
100 U Type2 PersonB U1 ID4
100 U Type2 PersonB U2 ID6
100 U Type2 PersonB U3 ID5
;
run;

proc sort data=dset1; by num visit type; run;
proc sort data=dset2; by num visitname type; run;

data temp;
 merge dset1 
       dset2(rename=(visitname=visit));
  by num visit type;
run;

View solution in original post

4 REPLIES 4
ballardw
Super User

Rules are helpful.

 

Your example data steps do not look like the pictures, the order is different. So without some pretty explicit rules I do not quite know what you are actually expecting.

sayno2s
Fluorite | Level 6

Thank you for taking a look!

Sorry, I had sorted the excel data somewhere in the meantime. The datalines below are now corresponding to the screenshot.

 

I hope this next screenshot also helps. The issue is that the "U" visits in dset1 are not distinct (this information is available in "OrigVisitName" in dset2). Now I want join the tables but I can't get it to work. It's either too few U visits because of incorporating fewer variables (Wrong Table 2) or there are too many because everyone matches with everyone (Wrong Table 2).


Q-Tables2.PNG

 

data dset1;
input Num VISIT $ Type $ Prot $ Result $;
datalines;
Num VISIT Type Prot Result
100 M4 Type1 Prot1 Outcome2
100 M4 Type2 Prot1 Outcome1
100 U Type1 Prot1 Outcome3
100 U Type1 Prot1 Outcome3
100 U Type1 Prot2 Outcome4
100 U Type2 Prot1 Outcome3
100 U Type2 Prot1 Outcome3
100 U Type2 Prot2 Outcome5
100 W12 Type1 Prot2 Outcome5
100 W12 Type2 Prot2 Outcome5
100 W6 Type1 Prot2 Outcome5
100 W6 Type2 Prot2 Outcome6
;
run

data dset2;
input Num VisitName $ Type $ Reader $ OrigVisitName $ InternalID $;
datalines;
100 M4 Type1 PersonA Month4 ID1
100 M4 Type2 PersonB Month4 ID1
100 U Type1 PersonA U1 ID4
100 U Type1 PersonA U2 ID6
100 U Type1 PersonA U3 ID5
100 U Type2 PersonB U1 ID4
100 U Type2 PersonB U2 ID6
100 U Type2 PersonB U3 ID5
100 W12 Type1 PersonA Week12 ID2
100 W12 Type2 PersonB Week12 ID2
100 W6 Type1 PersonA Week6 ID3
100 W6 Type2 PersonB Week6 ID3
;
run 
Shmuel
Garnet | Level 18

Try next code. My result may be in different order then yours.

data dset1;
input Num VISIT $ Type $ Prot $ Result $;
datalines;
100 M4 Type2 Prot1 Outcome1
100 M4 Type1 Prot1 Outcome2
100 W12 Type2 Prot2 Outcome5
100 W12 Type1 Prot2 Outcome5
100 W6 Type2 Prot2 Outcome6
100 W6 Type1 Prot2 Outcome5
100 U Type2 Prot1 Outcome3
100 U Type1 Prot1 Outcome3
100 U Type2 Prot1 Outcome3
100 U Type1 Prot1 Outcome3
100 U Type2 Prot2 Outcome5
100 U Type1 Prot2 Outcome4
;
run;

data dset2;
input Num VisitName $ Type $ Reader $ OrigVisitName $ InternalID $;
datalines;
100 M4 Type1 PersonA Month4 ID1
100 M4 Type2 PersonB Month4 ID1
100 W12 Type1 PersonA Week12 ID2
100 W12 Type2 PersonB Week12 ID2
100 W6 Type1 PersonA Week6 ID3
100 W6 Type2 PersonB Week6 ID3
100 U Type1 PersonA U1 ID4
100 U Type1 PersonA U2 ID6
100 U Type1 PersonA U3 ID5
100 U Type2 PersonB U1 ID4
100 U Type2 PersonB U2 ID6
100 U Type2 PersonB U3 ID5
;
run;

proc sort data=dset1; by num visit type; run;
proc sort data=dset2; by num visitname type; run;

data temp;
 merge dset1 
       dset2(rename=(visitname=visit));
  by num visit type;
run;
sayno2s
Fluorite | Level 6

I could swear I tried that...but apparently not as this works!

 

Thank you so much!

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 788 views
  • 1 like
  • 3 in conversation