BookmarkSubscribeRSS Feed
manikanta9500
Calcite | Level 5

Hi All,

I am New to this forum, but I am hoping can get a reply to my question fairly quickly.

 

Requirement :

 

1) Import excel file present location into SAS dataset ( file name wave1) 

Note : Excel  is refreshed on daily  bases

2) Current SAS dataset is an exact replica of the Excel wave1. So if there are any columns missing in excel which was previously present for some reason , these columns will be missing from the SAS dataset too. However, Spotfire marks it as an invalid dataset since few columns are missing. 

To overcome this we have placed previous created SAS dataset in backup folder and using below code retain variable structure.

 

After importing into excel file into SAS data set I am using below code

 

data final;

set new previous (obs=0);

run;

 

Above code work fine sometime and it will throw below error some time 

Error : Variable has been defined as both character and numeric.

 

Please help me to resolve this error permanently .

 

Thanks in advance .

 

 

4 REPLIES 4
andreas_lds
Jade | Level 19

The best way:

  • don't accept excel as data source
  • use csv instead 
  • don't use proc import to read csv, but write your own data step
ballardw
Super User

Proc Import makes guesses as to variable types based on the content of the file. Separately for each file.

So when you "import" multiple different files that have the same structure because Excel does not enforce any rules as to content of cells you may have different types of data in the same column from different files. So the result is different variable types and different lengths for character variables (which can cause truncation of data).

 

Also if someone inserts an additional row of headers that second row of headers now becomes data and you might even have completely different names of variables.

 

Most of the time saving the file to CSV (there are several choices in the Excel menu under File-Save as) will create a text file that a SAS data step can read in a consistent manner. That way all of the variables have the same type, length and properties such as variable name as well for each file. The provider of the files should be able to tell you which columns are supposed to be text and how many characters, which numeric and which date or time values. Then you can set the proper type and properties in the SAS data step code.

 

Save the file as CSV, run Proc import for the CSV file using the option GUESSINGROWS=MAX to have the procedure look at more rows before guessing the properties for the variables. The LOG will show a generated data step that you can copy to the Editor and clean up: remove line numbers, verify informats (character length and if the type is correct). SAVE that program as a basis for the code. Next file you change the name of the input file and maybe the output SAS data set, depending on how you want to work.

Kurt_Bremser
Super User

If you expect ANY kind of consistency, then you simply MUST NOT use Excel files, period.

Excel files force SAS to make guesses about column attributes, and that can fail for numerous reasons.

 

  • Set up a data dictionary that clearly and unambiguously describes all columns
  • Accept only text files as transfer format
  • Write a data step that reads the files according to the dictionary, and fails with a clear message if invalid data is encountered
Tom
Super User Tom
Super User

Who is refreshing the EXCEL file? How are they refreshing it?  Can you have them create a delimited text file instead?

 

If you cannot you might try creating the delimited text file yourself.  That might let you handle the new files with little change to your current SAS program.  Just add a step to write the imported EXCEL file out as text and then read it using the structure of the target.  This should fix most of the problems with wrong variable types.  

So convert he EXCEL file to a CSV file.

proc import datafile='newfile.xlsx' out=newfile replace;
run;
filename csv temp;
data _null_;
  set newfile;
  file csv dsd termstr=crlf ;
  put (_all_) (+0);
run;

Now use a data step to read the CSV file.  You could use the structure of your PREVIOUS file to drive the reading.  Make sure that proper informats are attached to any variable that might need them.  Usually only DATE/TIME or DATETIME variables need special informats, but if you have fields with commas and/or dollar signs that you want read as numbers you need to use the COMMA informat.

data final;
  if 0 then set previous;
  infile csv dsd truncover termstr=crlf;
  input (_all_) (+0);
run;

Note this will not work if they change the number or order of the variables (columns) in the file.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1983 views
  • 0 likes
  • 5 in conversation