BookmarkSubscribeRSS Feed
ybz12003
Rhodochrosite | Level 12

Hello,

I use a macro to import 8 Excel files with 5 spreadsheets, then I combine all the sheets based on the different years.   I found that I could not do the merge due to lots of variables are not in the same format.  Please see the sample codes and error messages below.

%let Y1=1617;
%let Y2=1718;
%let Y3=1819;
%let X1=VAN;
%let X2=ROC;
%let X3=CIN;

%let X8=PIT;

%macro import;

%do i  = 1 %to 3;

 %do z  = 1 %to 3;

PROC IMPORT OUT=&&X&i.._&&Y&z.
			datafile="Pathwasy\Underly_Con_&&X&i.."
			dbms=xlsx replace;
			sheet=&&X&i.._&&Y&z..;
run;

%end; 
%end; 
%mend;
%import;

data underly_1617;
	set cin_1617 hou_1617 kc_1617 pit_1617 roc_1617 sea_1617 van_1617;
run;

proc sort data=underly_1617; by studysite caseid; run;

Brief error messages are shown below, more than twenty variables had both character and numeric issues when I tried to merge in the last step.   I am curious whether I should format those variables into numeric while I did the macro import; or get around of both character and numeric issue while I combine the subset.   Please advise, thank you.

 

70 data underly_1617;
71 set cin_1617 roc_1617  van_1617;
ERROR: Variable uapnea has been defined as both character and numeric.
ERROR: Variable ugerd has been defined as both character and numeric.
ERROR: Variable uprem has been defined as both character and numeric.
ERROR: Variable ugestage has been defined as both character and numeric.
72 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.UNDERLY_1617 may be incomplete. When this step was stopped there were 0
observations and 49 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds

 

3 REPLIES 3
Tom
Super User Tom
Super User

The best way is to not use XLSX files as your data source.  Spreadsheets are not databases. Each cell is independent.

Either use a SAS datasets or some other format that actually creates structured data.

Or use a CSV file or other text files and write your own data step(s) to read the files so that you have control.

 

You might also want to look into why variables are getting defined with the wrong type.  What is the correct type for the variable?  If it is supposed to be character make sure all of the cells in that column contain text instead of numbers. If the issue is just a couple of cells it might faster make a copy of the XLSX file and clean it up before trying to import it.

 

You might try exporting the sheets to CSV files yourself and then using data steps to read them.

If that is not workable you might try taking the datasets you got from PROC IMPORT and writing them out as CSV files from SAS.  That has the advantage of being programmable.   But then you need to make sure the columns in the sheets are in the same order.  But if the PROC IMPORT step has caused the values to be read wrong you might not notice the change For example if you have a date cell in a column that needs to be character in SAS then you get a digit string that represents the internal number that EXCEL uses for that date instead of something a human would recognize as a date.

Kurt_Bremser
Super User

That's why using Excel files as data source is always a BAD IDEA. Save the sheets to csv files (or require such in the first place), so you can read them all with the same DATA step.

Quentin
Super User

As others have said, Excel is bad. : )

 

That said, if you want to use the XLSX engine, I think you will be stuck doing something like:

  1. Import a sheet
  2. Check the variable types (using proc contents or whatever)
  3. Use a DATA step to change the variable types to what you want them to be.

That can all be automated, but it's a hassle.  The problem is that Excel doesn't have variable types, so SAS needs to look at the values in a column of cells and guess at a variable type.  

 

This sort of importing work would be much easier if XLSX engine supported the DBSASTYPE option.  Then you could use the option to specify the variable type to be used when reading into SAS.  I submitted a ballot item with that suggestion.  If you agree, please upvote it:  https://communities.sas.com/t5/SASware-Ballot-Ideas/Add-DBSASTYPE-or-similar-option-to-XLSX-engine-t...

 

If you can use the older EXCEL or PCFILES engines, they do support the DBSASTYPE option.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 937 views
  • 2 likes
  • 4 in conversation