hoisum
Calcite | Level 5

Hello,

 

I'm trying to concatenate 12 different datasets. They do not all have the same number of observations, nor do they all have the same variables, but I still want to concatenate them. I've read in the 12 separate datasets with no problem.

 

Then, when I attempted to concatenate, via:

data concat;
   set DN1x DN2x DN3 DN4x DN5 DN6x DN7x DN8x DN9 DN10 DN11x DN12x;
run;

I encountered a problem in the log where it says: 

ERROR: Variable Height has been defined as both character and numeric.

 

Then, I tried to change the variables that were character into numeric in each dataset via:

data new;
set original; Height = 'NA'; Heightx = input(Height, 8.); drop height; rename Heightx=Height; run;

When I then looked at the dataset, every value of the variable Height was just "." instead of the original numeric data. I'm just trying to concatenate these 12 datasets together!

 

Please help! :"( 

 

4 REPLIES
Patrick
Opal | Level 21

Height = 'NA';

In your code you first assign the string 'NA' to the variable and only then try to convert the string to a number. Because the source string is now always 'NA', the conversion always results in a missing value.

 

Try the code below instead (only on the datasets where Height is character).

 

data new;
  set original;
  Heightx = input(Height, ?? best32.);
  drop height;
  rename Heightx=Height;
run;
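
To find out which of the datasets actually store Height as character, one option is to query the DICTIONARY.COLUMNS metadata table. This is only a sketch: it assumes the twelve datasets live in the WORK library and that their names all start with DN, as in the SET statement from the question.

```sas
/* Sketch: list the type of Height in every DN* dataset.            */
/* Assumption: the datasets are in WORK; change libname otherwise.  */
proc sql;
  select memname, name, type, length
    from dictionary.columns
    where libname = 'WORK'
      and memname like 'DN%'
      and upcase(name) = 'HEIGHT'
    order by memname;
quit;
```

Datasets listed with type char are the ones that need the conversion step above; those listed with type num can go into the SET statement unchanged.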

 

 

 

 

hoisum
Calcite | Level 5

Hi! 

 

Thank you! I was able to successfully concatenate my 12 datasets. I've encountered another problem, however. One of the datasets did not have a variable, let's say "comorbidities," that the other 11 datasets have.

 

When I concatenated, that variable was just blank for the rows coming from the dataset where it did not exist. I tried to create the variable in that dataset and set it to ".", but was unsuccessful. I used the following:

 

data new;
set orig; 
if comorbidities=" " then comorbidities="."; 
run; 

... and then, when I tried removing missing data for a complete case analysis...

 

data concat2; 
set concat;
if height=. then delete;
if alcohol=. then delete;
if comorbidites=. then delete;
run;

... the SAS log says the variable comorbidities is "uninitialized," and I don't quite understand what that's supposed to mean. 😞

ballardw
Super User

Here I think you are confusing missing values. It appears that your comorbidities variable is character. For the dataset without the variable, the values will therefore be missing (blank) after concatenation. For a character variable, "." would be an actual value rather than a missing one, so you likely just need to leave it alone.
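
A small sketch of that behavior, using two made-up datasets (a and b, with a hypothetical id variable) where only the first one has comorbidities:

```sas
/* Hypothetical data: dataset b lacks the character variable comorbidities */
data a;
  length id 8 comorbidities $10;
  id = 1;
  comorbidities = 'diabetes';
run;

data b;
  id = 2;
run;

data both;
  set a b;
  /* flag is 1 when comorbidities is blank, which is the case for the row from b */
  flag = missing(comorbidities);
run;

proc print data=both;
run;
```

The row that comes from b should print with a blank comorbidities value, which is how SAS represents a missing character value.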

If you want to test whether a character variable is missing, it is better to use the MISSING function. It works for both character and numeric variables. For example:

 

if missing(comorbidities) then delete;

 

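About the note that comorbidities is "uninitialized": SAS writes that note when a data step references a variable that is not in the input dataset and is never assigned a value. You should post the entire data step from the log together with the message. I suspect a spelling issue, because your code as shown has

if comorbidites=. then delete; 

(note the missing "i"), and I suspect that the spelling in your question text was autocorrected by the forum to comorbidities.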
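
As a sketch of the complete case step with the spelling fixed, the three tests could also be combined with the CMISS function, which counts missing arguments and accepts character and numeric variables alike. The dataset and variable names are the ones from the question.

```sas
data concat2;
  set concat;
  /* CMISS returns the number of missing arguments, character or numeric */
  if cmiss(height, alcohol, comorbidities) > 0 then delete;
run;
```

Keep in mind that every row from the dataset that never had comorbidities is missing that variable, so a step like this removes all of that dataset's rows.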

 

 

SASKiwi
PROC Star

Your statement Height = 'NA'; overwrites any data read from the ORIGINAL dataset. Try removing this statement.
