I am trying to merge a number of datasets with certain information that is missing from the master data by matching their ID numbers. I am suing the merge statement but have a few problems. (I have sorted all my datasets by ID numbers). One of the variables is defined as both a character and a number so I changed that variable with the code below
data work.dat1;
set work.dat1;
quant=input(quant2, 10.);
drop quant;
run;
Additionally, when I run the merge statement it says that the variable race has multiple lengths in the data sets. So, I tried to specify the length before the merge statement but it doesn't seem to work and my results are basically showing up as blank for both the race and the converted new quant2 variable.
Here is the code for specifying the length of race variable:
data merged_set;
length race $32;
merge master_data dat1 dat2 dat3 dat4 dat5 dat6 dat7dat8 dat9 dat10;
by ID;
run;
Hard to say from what you have written. Variables showing up as missing could be caused by a lot of different things. The LENGTH warning has to be fixed by changing the length of variables in the input data sets, not in the output data set; and this may or may not be a cause of the missings.
The first debugging step I would follow is to take one particular ID and look at the values in all of the input data sets to see if you can figure out why the value in the resulting MERGED_SET winds up as missing.
Also, when you do a match MERGE here, if a variable exists in multiple data sets, you get the value of a variable from the right-most data set in the MERGE statement, in this case DAT10. If the value is missing for a variable in DAT10, it will show up as missing in MERGED_SET.
@Yughaber wrote:
The quant variable shows up as missing only after the error that tells me variable is both character and numeric in the data sets and I attempt to change that with the code above.
This is a clue. Perhaps you have changed the code incorrectly.
Also, I am checking with proc contents and the values are not missing in anything at all. It even says that race has a length of 40 and once I print my results I can see that only certain IDs have a match for race and others are left blank.
PROC CONTENTS does not tell you if values are missing. As I said earlier, you have to LOOK at the data sets yourself, with your own eyes, to see if there are missing in there. I'm afraid there's really no other path forward than for you to LOOK at the input data sets and see what is going on for one such ID where you are getting missings.
Maxim 2: Read the Log.
Post the complete log from this step (use the </> button), and we can explain what goes wrong.
The LOG is clearly and specifically telling you what is wrong.
Paying attention to details is an absolute prerequisite for successful SASing.
The complete log includes all code of the step, and you need to post the log using the </> button.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.