I have two tables
TableA
key | var1 | var2 | var3 | count1 | count2 |
123456789 | usa | hi | 96713 | 1 | 1 |
784569102 | usa | ny | 11108 | 1 | |
635472134 | usa | dc | 78456 | 1 | 1 |
TableB
key | Key2 | var4 |
123456789 | 1 | Processed |
123456789 | 2 | NP |
784569102 | 1 | Failed |
635472134 | 1 | Processed |
635472134 | 2 | Failed |
I am trying to merge them using the following code
PROC SORT DATA=TABLEA;
BY KEY;
RUN;
PROC SORT DATA=TABLEB;
BY KEY;
RUN;
DATA COMBINED;
MERGE TABLEA(IN=A) TABLEB(IN=B);
BY KEY;
IF A AND B;
RUN;
The output looks like this : -
key | var1 | var2 | var3 | count1 | count2 | key | Key2 | var4 |
123456789 | usa | hi | 96713 | 1 | 1 | 123456789 | 1 | Processed |
123456789 | usa | hi | 96713 | 1 | 1 | 123456789 | 2 | NP |
784569102 | usa | ny | 11108 | 1 | 784569102 | 1 | Failed | |
635472134 | usa | dc | 78456 | 1 | 1 | 635472134 | 1 | Processed |
635472134 | usa | dc | 78456 | 1 | 1 | 635472134 | 2 | Failed |
I do not want my TableA colums to be duplicated in the result and only the TableB colums. The reason is I will be using count1 and count2 in later step to summarize. so effectively I want the resultant table to look like this
key | var1 | var2 | var3 | count1 | count2 | key | Key2 | var4 |
123456789 | usa | hi | 96713 | 1 | 1 | 123456789 | 1 | Processed |
123456789 | 2 | NP | ||||||
784569102 | usa | ny | 11108 | 1 | 784569102 | 1 | Failed | |
635472134 | usa | dc | 78456 | 1 | 1 | 635472134 | 1 | Processed |
635472134 | 2 | Failed |
I can do this by looking at the duplicates at a later step and make the TableA columns to missing (call missing). Is there a better approach to arrive this solution.
Thanks !
It looks like you want a join, but then a report procedure to display the data? Storing data in that manner is a bit weird, but it’s very common for reports not to duplicate data. I believe both PROC REPORT or PRINT have methods/options to prevent duplicates.
1. TableA unique by key and TableB has duplicates
2. What functionality in PROC REPORT and PRINT would allow me to summarize without including duplicates.
Here's the general idea. You may want to fiddle with it of course:
proc sort data=combined;
by key var1 var2 var3 count1 count2;
run;
proc print data=combined;
by key var1 var2 var3 count1 count2;
id key var1 var2 var3 count1 count2;
var key2 var4;
run;
If you actually want KEY to print twice, you have to switch to PROC REPORT. For PROC PRINT to print KEY twice you would need a DATA step to copy KEY into a second variable and print both variables.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.