Hello,
I am trying to merge three separate SAS datasets by two variables (ID Date) that are present in all three datasets. Two of three datasets successfully match merge. However, when I try to merge the third dataset in, instead of doing a match merge, it interleaves the datasets. I have checked the formatting and lengths of the variables in the three datasets using proc contents and they appear to be the same. I even tried to redefine the format, length, and label for each dataset to ensure they were identical and I am still not having luck. Any suggestions? Below is an example of my merge code. Dataset 1 and 2 (infect.weights and infect.temperature) contain four variables (ID, Group, Date,Weight/Temperature). Dataset 2 (infect.counts) contains nine variables (ID, Date, X1-X7).
DATA infect.weights;
SET infect.weights;
ATTRIB ID LENGTH=8 FORMAT=8. LABEL="ID";
ATTRIB DATE LENGTH=8 FORMAT=MMDDYY10. LABEL="Date";
RUN;
DATA infect.temperature;
SET infect.temperature;
ATTRIB ID LENGTH=8 FORMAT=8. LABEL="ID";
ATTRIB DATE LENGTH=8 FORMAT=MMDDYY10. LABEL="Date";
RUN;
DATA infect.cell;
SET infect.qscout;
ATTRIB ID LENGTH=8 FORMAT=8. LABEL="ID";
ATTRIB DATE LENGTH=8 FORMAT=MMDDYY10. LABEL="Date";
RUN;
DATA parameters;
MERGE infect.weights infect.temperature;
BY ID Date;
RUN;
DATA parameters2;
MERGE parameters infect.counts;
BY ID Date;
RUN;
/* I have also tried to perform the merge within one data step and get the same results */
My Results end up looking like this:
ID Group Date Weight Temp X1 X2 X3 X4 X5 X6 X7
1 1 01/01/2001 100 100 . . . . . . .
1 1 01/01/2001 . . 1 2 3 4 5 6 7
2 1 01/01/2001 125 101 . . . . . . .
2 1 01/01/2001 . . 8 9 10 11 12 13 14
Thanks in advance!
The result you show indicates that the internal stored value of either ID or Date (or possibly both) varies somehow. If the variables are character look for blank characters such as a null at the end or beginning of the value.
If numeric, check that there are not decimals in values that aren't displayed. Note that date values with decimals and assigned one of the SAS date formats will happily ignore the decimal portion for display but use the entire value for comparisons such as merge or logic tests.
See:
data example; date= '01Jan2018'd + 0.01; format date mmddyy10.; run; data example2; date= '01Jan2018'd ; format date mmddyy10.; run; data merged; merge example example2; by date; run;
I might try proc freq to display the ID or Date value using a BEST12. or similar to see if decimals are in your data. If so you might try rounding the values in a data step prior to the merge.
What does the log show?
@HHutch wrote:
Hello,
I am trying to merge three separate SAS datasets by two variables (ID Date) that are present in all three datasets. Two of three datasets successfully match merge. However, when I try to merge the third dataset in, instead of doing a match merge, it interleaves the datasets. I have checked the formatting and lengths of the variables in the three datasets using proc contents and they appear to be the same. I even tried to redefine the format, length, and label for each dataset to ensure they were identical and I am still not having luck. Any suggestions? Below is an example of my merge code. Dataset 1 and 2 (infect.weights and infect.temperature) contain four variables (ID, Group, Date,Weight/Temperature). Dataset 2 (infect.counts) contains nine variables (ID, Date, X1-X7).
DATA infect.weights; SET infect.weights; ATTRIB ID LENGTH=8 FORMAT=8. LABEL="ID"; ATTRIB DATE LENGTH=8 FORMAT=MMDDYY10. LABEL="Date"; RUN; DATA infect.temperature; SET infect.temperature; ATTRIB ID LENGTH=8 FORMAT=8. LABEL="ID"; ATTRIB DATE LENGTH=8 FORMAT=MMDDYY10. LABEL="Date"; RUN; DATA infect.cell; SET infect.qscout; ATTRIB ID LENGTH=8 FORMAT=8. LABEL="ID"; ATTRIB DATE LENGTH=8 FORMAT=MMDDYY10. LABEL="Date"; RUN; DATA parameters; MERGE infect.weights infect.temperature; BY ID Date; RUN; DATA parameters2; MERGE parameters infect.counts; BY ID Date; RUN; /* I have also tried to perform the merge within one data step and get the same results */
My Results end up looking like this:
ID Group Date Weight Temp X1 X2 X3 X4 X5 X6 X7 1 1 01/01/2001 100 100 . . . . . . . 1 1 01/01/2001 . . 1 2 3 4 5 6 7 2 1 01/01/2001 125 101 . . . . . . . 2 1 01/01/2001 . . 8 9 10 11 12 13 14
Thanks in advance!
The log shows
327 DATA parameters2;
328 MERGE parameters infect.qscout;
329 BY ID Date;
330 RUN;
NOTE: There were 375 observations read from the data set WORK.PARAMETERS.
NOTE: There were 233 observations read from the data set INFECT.QSCOUT.
NOTE: The data set WORK.PARAMETERS2 has 608 observations and 12 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
The result you show indicates that the internal stored value of either ID or Date (or possibly both) varies somehow. If the variables are character look for blank characters such as a null at the end or beginning of the value.
If numeric, check that there are not decimals in values that aren't displayed. Note that date values with decimals and assigned one of the SAS date formats will happily ignore the decimal portion for display but use the entire value for comparisons such as merge or logic tests.
See:
data example; date= '01Jan2018'd + 0.01; format date mmddyy10.; run; data example2; date= '01Jan2018'd ; format date mmddyy10.; run; data merged; merge example example2; by date; run;
I might try proc freq to display the ID or Date value using a BEST12. or similar to see if decimals are in your data. If so you might try rounding the values in a data step prior to the merge.
You were right! The one file's date contains values after the decimal point. I didn't realize the dataset it was being pulled from used a date/time format.
Thank you for your help! It is greatly appreciated!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.