BookmarkSubscribeRSS Feed
Yughaber
Quartz | Level 8

I am trying to merge a number of datasets with certain information that is missing from the master data by matching their ID numbers. I am suing the merge statement but have a few problems. (I have sorted all my datasets by ID numbers). One of the variables is defined as both a character and a number so I changed that variable with the code below

 

data work.dat1;

set work.dat1;

quant=input(quant2, 10.);

drop quant;

run;

 

Additionally, when I run the merge statement it says that the variable race has multiple lengths in the data sets. So, I tried to specify the length before the merge statement but it doesn't seem to work and my results are basically showing up as blank for both the race and the converted new quant2 variable.

 

Here is the code for specifying the length of race variable: 

 

data merged_set;

length race $32;

merge master_data dat1 dat2 dat3 dat4 dat5 dat6 dat7dat8 dat9 dat10;

by ID;

run;

8 REPLIES 8
PaigeMiller
Diamond | Level 26

Hard to say from what you have written. Variables showing up as missing could be caused by a lot of different things. The LENGTH warning has to be fixed by changing the length of variables in the input data sets, not in the output data set; and this may or may not be a cause of the missings.

 

The first debugging step I would follow is to take one particular ID and look at the values in all of the input data sets to see if you can figure out why the value in the resulting MERGED_SET winds up as missing.


Also, when you do a match MERGE here, if a variable exists in multiple data sets, you get the value of a variable from the right-most data set in the MERGE statement, in this case DAT10. If the value is missing for a variable in DAT10, it will show up as missing in MERGED_SET.

--
Paige Miller
Yughaber
Quartz | Level 8
The quant variable shows up as missing only after the error that tells me variable is both character and numeric in the data sets and I attempt to change that with the code above.

Also, I am checking with proc contents and the values are not missing in anything at all. It even says that race has a length of 40 and once I print my results I can see that only certain IDs have a match for race and others are left blank
PaigeMiller
Diamond | Level 26

@Yughaber wrote:
The quant variable shows up as missing only after the error that tells me variable is both character and numeric in the data sets and I attempt to change that with the code above.

This is a clue. Perhaps you have changed the code incorrectly.

 

Also, I am checking with proc contents and the values are not missing in anything at all. It even says that race has a length of 40 and once I print my results I can see that only certain IDs have a match for race and others are left blank.

PROC CONTENTS does not tell you if values are missing. As I said earlier, you have to LOOK at the data sets yourself, with your own eyes, to see if there are missing in there. I'm afraid there's really no other path forward than for you to LOOK at the input data sets and see what is going on for one such ID where you are getting missings.

--
Paige Miller
Yughaber
Quartz | Level 8
hm this is the code I am using
data work.dat1;
set work.dat1;
quant=input(quant2, 10.);
drop quant;
run;
Is it maybe because I am dropping quant itself? but then it should have quant2 as the new variable

I actually also looked at the data and everything is there!
Yughaber
Quartz | Level 8
the log doesn't throw any type of warning or error for this

NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
NOTE: Variable quant is uninitialized.
NOTE: There were 56350 observations read from the data set work.dat1.
NOTE: The data set work.dat1 has 56350 observations and 6 variables.
PaigeMiller
Diamond | Level 26

The LOG is clearly and specifically telling you what is wrong.

--
Paige Miller
Kurt_Bremser
Super User

Paying attention to details is an absolute prerequisite for successful SASing.

The complete log includes all code of the step, and you need to post the log using the </> button.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1305 views
  • 2 likes
  • 3 in conversation