Variable is defined as both character and numeric when defining 3 datasets by user id using the set statement

Accepted Solution Solved
Reply
New Contributor BK1
New Contributor
Posts: 2
Accepted Solution

Variable is defined as both character and numeric when defining 3 datasets by user id using the set statement

Hi All:

New to this forum, but I am hoping can get a reply to my question fairly quickly. I have three datasets, pilot, full and recent. When I imported my datasets into SAS I had to remove the second row of the excel file, because the labels were in the 2nd row and they were messing up some code I was running. So, now I just have the variable names, fine. Anyhow, in trying to combine the datasets, it appears some variables are defined as both character and numeric (numeric in 2 of the datasets, character in one) so its not allowing me to combine the datasets into a single file using the set statement.

I used this code in the hopes of standardizing the variable formats :

data combined;

set pilot full recent;

VAR20=VAR20*1;

VAR21=VAR21*1;

/*etc*/

run;

but its continuing with the error. The log is below. Any help would be appreciated! Thank you.

23338  data combined;

23339  set pilot full recent;
ERROR: Variable VAR20 has been defined as both character and numeric.

ERROR: Variable VAR21 has been defined as both character and numeric.

ERROR: Variable VAR22 has been defined as both character and numeric.

ERROR: Variable VAR23 has been defined as both character and numeric.

ERROR: Variable VAR24 has been defined as both character and numeric.

ERROR: Variable VAR25 has been defined as both character and numeric.

ERROR: Variable VAR26 has been defined as both character and numeric.

ERROR: Variable VAR236 has been defined as both character and numeric.

ERROR: Variable VAR20 has been defined as both character and numeric.

ERROR: Variable VAR21 has been defined as both character and numeric.

ERROR: Variable VAR22 has been defined as both character and numeric.

ERROR: Variable VAR23 has been defined as both character and numeric.

ERROR: Variable VAR24 has been defined as both character and numeric.

ERROR: Variable VAR25 has been defined as both character and numeric.

ERROR: Variable VAR26 has been defined as both character and numeric.

ERROR: Variable VAR28 has been defined as both character and numeric.

ERROR: Variable VAR29 has been defined as both character and numeric.

ERROR: Variable VAR30 has been defined as both character and numeric.

ERROR: Variable VAR31 has been defined as both character and numeric.

ERROR: Variable VAR32 has been defined as both character and numeric.

ERROR: Variable VAR33 has been defined as both character and numeric.

ERROR: Variable VAR34 has been defined as both character and numeric.

ERROR: Variable VAR244 has been defined as both character and numeric.

23340  VAR20=VAR20*1;

23341  VAR21=VAR21*1;

23342  VAR22=VAR22*1;

23343  VAR23=VAR23*1;

23344  VAR24=VAR24*1;

23345  VAR25=VAR25*1;

23346  VAR26=VAR26*1;

23347  VAR28=VAR28*1;

23348  VAR29=VAR29*1;

23349  VAR30=VAR30*1;

23350  VAR31=VAR31*1;

23351  VAR32=VAR32*1;

23352  VAR33=VAR33*1;

23353  VAR34=VAR34*1;

23354  VAR236=VAR236*1;

23355  VAR244=VAR244*1;

23356  by UserId;

23357  run;



NOTE: Character values have been converted to numeric values at the places given by:


      (Line)Smiley SadColumn).


      23354:8   23355:8


NOTE: Numeric values have been converted to character values at the places given by:


      (Line)Smiley SadColumn).


      23354:14   23355:14


NOTE: The SAS System stopped processing this step because of errors.


WARNING: The data set WORK.COMBINED may be incomplete.  When this step was stopped there were 0


         observations and 316 variables.


WARNING: Data set WORK.COMBINED was not replaced because this step was stopped.


NOTE: DATA statement used (Total process time):


      real time           0.01 seconds


      cpu time            0.01 seconds


Accepted Solutions
Solution
‎08-09-2014 02:50 PM
Super User
Super User
Posts: 6,500

Re: Variable is defined as both character and numeric when defining 3 datasets by user id using the set statement

This is a problem with using EXCEL to store your data.  It is a spreadsheet and not a database and so any cell could have any type of data.  Thus SAS (or more likely the Jet Engine from Microsoft that it calls) tries to make a guess as to the variable types based on the data that it sees.

First try saving the files as .XLSX files and import those.

Otherwise I find it much easier to save the files as CSV files and then input them using a SAS data step.  That way I can define the variable types.

Since you are just putting them together anyway and you already took out the column headers you could probably read all three in one data step.

filename all ('pilot.csv','full.csv','recent.csv');

data combined ;

  infile all dsd truncover lrecl=30000 ;

  input var1-v316 ;

run;

If you need to define some of them as character then add a LENGTH statement before the INPUT.  For example if VAR1 is character ID variable then you might add LENGTH VAR1 $10;

If some are dates or times then add INFORMAT and FORMAT statements for those variables.

View solution in original post


All Replies
Solution
‎08-09-2014 02:50 PM
Super User
Super User
Posts: 6,500

Re: Variable is defined as both character and numeric when defining 3 datasets by user id using the set statement

This is a problem with using EXCEL to store your data.  It is a spreadsheet and not a database and so any cell could have any type of data.  Thus SAS (or more likely the Jet Engine from Microsoft that it calls) tries to make a guess as to the variable types based on the data that it sees.

First try saving the files as .XLSX files and import those.

Otherwise I find it much easier to save the files as CSV files and then input them using a SAS data step.  That way I can define the variable types.

Since you are just putting them together anyway and you already took out the column headers you could probably read all three in one data step.

filename all ('pilot.csv','full.csv','recent.csv');

data combined ;

  infile all dsd truncover lrecl=30000 ;

  input var1-v316 ;

run;

If you need to define some of them as character then add a LENGTH statement before the INPUT.  For example if VAR1 is character ID variable then you might add LENGTH VAR1 $10;

If some are dates or times then add INFORMAT and FORMAT statements for those variables.

Respected Advisor
Posts: 4,646

Re: Variable is defined as both character and numeric when defining 3 datasets by user id using the set statement

Importing data from Excel worksheets into a database structure such as SAS is a thorny issue. The problem is that the data type of each column is determined by Excel and it only scans the first 8 lines of data to do so. You can change that behaviour (you only have to do this once) by modifying the typeGuessRows entry to zero in the Windows registry. When typeGuessRows=0, all data lines are scanned to determine column data types. The procedure is described here:

SAS/ACCESS(R) 9.3 Interface to PC Files: Reference

PG

PG
Contributor
Posts: 39

Re: Variable is defined as both character and numeric when defining 3 datasets by user id using the

I know it sounds something else but where is the windows registry ? Thank you

 

New Contributor BK1
New Contributor
Posts: 2

Re: Variable is defined as both character and numeric when defining 3 datasets by user id using the set statement

Thank you, your code did the trick!

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 6764 views
  • 3 likes
  • 4 in conversation