BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BK1
Calcite | Level 5 BK1
Calcite | Level 5

Hi All:

New to this forum, but I am hoping can get a reply to my question fairly quickly. I have three datasets, pilot, full and recent. When I imported my datasets into SAS I had to remove the second row of the excel file, because the labels were in the 2nd row and they were messing up some code I was running. So, now I just have the variable names, fine. Anyhow, in trying to combine the datasets, it appears some variables are defined as both character and numeric (numeric in 2 of the datasets, character in one) so its not allowing me to combine the datasets into a single file using the set statement.

I used this code in the hopes of standardizing the variable formats :

data combined;

set pilot full recent;

VAR20=VAR20*1;

VAR21=VAR21*1;

/*etc*/

run;

but its continuing with the error. The log is below. Any help would be appreciated! Thank you.

23338  data combined;

23339  set pilot full recent;
ERROR: Variable VAR20 has been defined as both character and numeric.

ERROR: Variable VAR21 has been defined as both character and numeric.

ERROR: Variable VAR22 has been defined as both character and numeric.

ERROR: Variable VAR23 has been defined as both character and numeric.

ERROR: Variable VAR24 has been defined as both character and numeric.

ERROR: Variable VAR25 has been defined as both character and numeric.

ERROR: Variable VAR26 has been defined as both character and numeric.

ERROR: Variable VAR236 has been defined as both character and numeric.

ERROR: Variable VAR20 has been defined as both character and numeric.

ERROR: Variable VAR21 has been defined as both character and numeric.

ERROR: Variable VAR22 has been defined as both character and numeric.

ERROR: Variable VAR23 has been defined as both character and numeric.

ERROR: Variable VAR24 has been defined as both character and numeric.

ERROR: Variable VAR25 has been defined as both character and numeric.

ERROR: Variable VAR26 has been defined as both character and numeric.

ERROR: Variable VAR28 has been defined as both character and numeric.

ERROR: Variable VAR29 has been defined as both character and numeric.

ERROR: Variable VAR30 has been defined as both character and numeric.

ERROR: Variable VAR31 has been defined as both character and numeric.

ERROR: Variable VAR32 has been defined as both character and numeric.

ERROR: Variable VAR33 has been defined as both character and numeric.

ERROR: Variable VAR34 has been defined as both character and numeric.

ERROR: Variable VAR244 has been defined as both character and numeric.

23340  VAR20=VAR20*1;

23341  VAR21=VAR21*1;

23342  VAR22=VAR22*1;

23343  VAR23=VAR23*1;

23344  VAR24=VAR24*1;

23345  VAR25=VAR25*1;

23346  VAR26=VAR26*1;

23347  VAR28=VAR28*1;

23348  VAR29=VAR29*1;

23349  VAR30=VAR30*1;

23350  VAR31=VAR31*1;

23351  VAR32=VAR32*1;

23352  VAR33=VAR33*1;

23353  VAR34=VAR34*1;

23354  VAR236=VAR236*1;

23355  VAR244=VAR244*1;

23356  by UserId;

23357  run;



NOTE: Character values have been converted to numeric values at the places given by:


      (Line):(Column).


      23354:8   23355:8


NOTE: Numeric values have been converted to character values at the places given by:


      (Line):(Column).


      23354:14   23355:14


NOTE: The SAS System stopped processing this step because of errors.


WARNING: The data set WORK.COMBINED may be incomplete.  When this step was stopped there were 0


         observations and 316 variables.


WARNING: Data set WORK.COMBINED was not replaced because this step was stopped.


NOTE: DATA statement used (Total process time):


      real time           0.01 seconds


      cpu time            0.01 seconds

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

This is a problem with using EXCEL to store your data.  It is a spreadsheet and not a database and so any cell could have any type of data.  Thus SAS (or more likely the Jet Engine from Microsoft that it calls) tries to make a guess as to the variable types based on the data that it sees.

First try saving the files as .XLSX files and import those.

Otherwise I find it much easier to save the files as CSV files and then input them using a SAS data step.  That way I can define the variable types.

Since you are just putting them together anyway and you already took out the column headers you could probably read all three in one data step.

filename all ('pilot.csv','full.csv','recent.csv');

data combined ;

  infile all dsd truncover lrecl=30000 ;

  input var1-v316 ;

run;

If you need to define some of them as character then add a LENGTH statement before the INPUT.  For example if VAR1 is character ID variable then you might add LENGTH VAR1 $10;

If some are dates or times then add INFORMAT and FORMAT statements for those variables.

View solution in original post

4 REPLIES 4
Tom
Super User Tom
Super User

This is a problem with using EXCEL to store your data.  It is a spreadsheet and not a database and so any cell could have any type of data.  Thus SAS (or more likely the Jet Engine from Microsoft that it calls) tries to make a guess as to the variable types based on the data that it sees.

First try saving the files as .XLSX files and import those.

Otherwise I find it much easier to save the files as CSV files and then input them using a SAS data step.  That way I can define the variable types.

Since you are just putting them together anyway and you already took out the column headers you could probably read all three in one data step.

filename all ('pilot.csv','full.csv','recent.csv');

data combined ;

  infile all dsd truncover lrecl=30000 ;

  input var1-v316 ;

run;

If you need to define some of them as character then add a LENGTH statement before the INPUT.  For example if VAR1 is character ID variable then you might add LENGTH VAR1 $10;

If some are dates or times then add INFORMAT and FORMAT statements for those variables.

PGStats
Opal | Level 21

Importing data from Excel worksheets into a database structure such as SAS is a thorny issue. The problem is that the data type of each column is determined by Excel and it only scans the first 8 lines of data to do so. You can change that behaviour (you only have to do this once) by modifying the typeGuessRows entry to zero in the Windows registry. When typeGuessRows=0, all data lines are scanned to determine column data types. The procedure is described here:

SAS/ACCESS(R) 9.3 Interface to PC Files: Reference

PG

PG
Theo_Gh
Obsidian | Level 7

I know it sounds something else but where is the windows registry ? Thank you

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 13970 views
  • 3 likes
  • 4 in conversation