New SAS User

Completely new to SAS or trying something new with SAS? Post here for help getting started.
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
XY7
Fluorite | Level 6 XY7
Fluorite | Level 6

Hi, I just started picking up SAS recently and am working on an assignment. I encountered a problem when trying to merge two data sets together using a variable that appears in both data sets, the error says that the variable has been defined as both character and numeric, but when I checked both data sets, the variable is of character type in both. Hence, I am not sure what exactly could the problem be. Could anyone kindly explain? 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Welcome to the world of (almost) real data where folks tell you one thing about their data and you get to determine they weren't really sure of what they had.

Your choice is to decide which variable type you want to use going forward. If a variable is an identifier that you aren't going to do arithmetic with then likely the character version is what you want.

 

Here is a code template for fixing such issues with a single variable that needs to go from numeric to character:

data fixed;
   set have (rename=(account=accountnum));
   length account $ 5.;
   account= put(accountnum,f5. -L);
   drop accountnum;
run;

The rename in the data set option allows you to use the value of the old variable but create a new variable of the desired name. SAS will not let you change variable types directly. Many have tried and many have failed.

 

To minimize possibly confusion explicitly specify the length of the character variable before use.

The put function with the -L left justifies the string in the result, otherwise you are likely to get leading spaces.

Then drop the old value.

 

If you need to go from character to numeric then the function would be INPUT instead of Put and the Length statement is likely not needed. You would need an appropriate informat to match you value type.

View solution in original post

14 REPLIES 14
PeterClemmensen
Tourmaline | Level 20

Welcome to the SAS community 🙂 How have you checked the two data sets? 

XY7
Fluorite | Level 6 XY7
Fluorite | Level 6
Thank you! :)The details of the variables and type was provided.
Sajid01
Meteorite | Level 14

This happens if a variable is of character type in one data set and numeric in another.

Please have a uniform data type or considering converting both to the same type.

 

XY7
Fluorite | Level 6 XY7
Fluorite | Level 6
It's weird because all of the data sets that contains the same variable is of character type, thank you, I will try converting the type.
Sajid01
Meteorite | Level 14
Well I would suggest you use proc compare as mentioned by tom or use proc content . This would help you decide your next steps.
Make sure the are identical - type, length, format .
XY7
Fluorite | Level 6 XY7
Fluorite | Level 6
Alright, thank you!
Tom
Super User Tom
Super User

You have to check ALL of the common variables. 

An easy what is to use PROC COMPARE.  Use OBS=0 dataset options to prevent it from actually checking the data.

Let's setup up some example datasets with incompatible variables.

data test1 (rename=(name=var1)) test2 (rename=(age=var1));
  set sashelp.class;
run;

Now let's use PROC COMPARE to compare the two datasets. In this case they are named TEST1 and TEST2.

proc compare data=test1(obs=0) compare=test2(obs=0);
run;

Looking at the output you can see this summary of the mis-matched variables.

Data Set Summary

Dataset              Created          Modified  NVar    NObs

WORK.TEST1  06JAN19:12:24:39  06JAN19:12:24:39     5      19
WORK.TEST2  06JAN19:12:24:39  06JAN19:12:24:39     5      19


Variables Summary

Number of Variables in Common: 4.
Number of Variables in WORK.TEST1 but not in WORK.TEST2: 1.
Number of Variables in WORK.TEST2 but not in WORK.TEST1: 1.
Number of Variables with Conflicting Types: 1.


Listing of Common Variables with Conflicting Types

Variable  Dataset     Type  Length

var1      WORK.TEST1  Char       8
          WORK.TEST2  Num        8

XY7
Fluorite | Level 6 XY7
Fluorite | Level 6

Thank you, I have found out the problem but it's weird because it was detailed as a character type for both data sets in the data dictionary that was provided. In this case how should I do the conversion?  

 

  Variables Summary                                                          
                                                                                                                                    
                                   Number of Variables in Common: 1.                                                                
                                   Number of Variables in WORK.ACCOUNT but not in WORK.LOAN: 3.                                   
                                   Number of Variables in WORK.LOAN but not in WORK.ACCOUNT: 6.                                   
                                   Number of Variables with Conflicting Types: 1.                                                   

                                                         
                                                                                                                                    
                                         Listing of Common Variables with Conflicting Types                                         
                                                                                                                                    
                               Variable    Dataset        Type  Length  Format   Informat  Label                                    
                                                                                                                                    
                               account_id  WORK.ACCOUNT  Char       5  $CHAR5.  $CHAR5.                                            
                                           WORK.LOAN     Num        8  BEST.              account_id                               
 

 

 

Tom
Super User Tom
Super User

So the dataset LOAN was created incorrectly if the variable ACCOUNT_ID is supposed to be a character variable.

How did you create the dataset?  Can you fix that step?

It MIGHT be possible to convert the number back to a character string, but perhaps not. 

For example what if you have the number 11 what string should you convert it to?  Did the original value have leading zeros?  If so how many?

Also what if the original value for ACCOUNT_ID had a letter in it?  There is no way that got converted to a number, so the value is now lost.

XY7
Fluorite | Level 6 XY7
Fluorite | Level 6

 

 

 

 

 

 

 

 

 

The data set wasn't created by me, it was provided, it was purposely created this way to be solved. I didn't realise that it was created wrongly previously.  

 

This is the column in AccountThis is the column in AccountThis is the column in LoanThis is the column in Loan

ballardw
Super User

Welcome to the world of (almost) real data where folks tell you one thing about their data and you get to determine they weren't really sure of what they had.

Your choice is to decide which variable type you want to use going forward. If a variable is an identifier that you aren't going to do arithmetic with then likely the character version is what you want.

 

Here is a code template for fixing such issues with a single variable that needs to go from numeric to character:

data fixed;
   set have (rename=(account=accountnum));
   length account $ 5.;
   account= put(accountnum,f5. -L);
   drop accountnum;
run;

The rename in the data set option allows you to use the value of the old variable but create a new variable of the desired name. SAS will not let you change variable types directly. Many have tried and many have failed.

 

To minimize possibly confusion explicitly specify the length of the character variable before use.

The put function with the -L left justifies the string in the result, otherwise you are likely to get leading spaces.

Then drop the old value.

 

If you need to go from character to numeric then the function would be INPUT instead of Put and the Length statement is likely not needed. You would need an appropriate informat to match you value type.

XY7
Fluorite | Level 6 XY7
Fluorite | Level 6
Thank you so much!
ballardw
Super User

One additional bit about the Format used for the PUT function. If your character values have leading zero you may want to use a Z format instead of F. That would come into play generally with fixed length identifiers such as 00543 where all of the values are padded with leading zeroes to a given length. If you have intermittent leading 0 without an explicit rule regarding what values may require them then matching the ID variables may get to be a problem.

XY7
Fluorite | Level 6 XY7
Fluorite | Level 6
Ohh, thank you very much!

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 14 replies
  • 93888 views
  • 7 likes
  • 5 in conversation