BookmarkSubscribeRSS Feed
robeldritch
Obsidian | Level 7

Hi folks,

 

I am attempting to read an xlsx file into SAS without SAS doing anything to modify the the variable types or the characters as they appear when viewing the xlsx file in Excel. In other words, if something looks like a date, regardless of how it's stored in the xlsx file (e.g., "11/25/2023"), it'll show up in sas as a character variable as that same value (e.g., "11/25/2023"). In other words, I'd like to replicate what the Import Data wizard in SAS does when it just defines all fields as character, as in the following attributes for one particular column, for example:

 

type=String

Source Informat=$CHAR29

Len.=29

Output Format=$CHAR29.

Output Informat=$CHAR29.

 

The code automatically generated from the Import Data wizard when I do this contains the DATALINES4 statement, which I can't use because I can't just fill all of these in, as I'm trying to load the data in without having to look at the contents. I'd be fine, though, if I can define the variable types/lengths/formats/etc., as I know what the variables are and what they'll be named.

 

How do I do this without having to fill in the DATALINES4 content?

 

I need to be able to specify the sheet within the Excel workbook in question as well.

 

Thanks!

 

 

16 REPLIES 16
ballardw
Super User

Proc Import file="<path>\your excel file.xlsx" out=yourdatasetname

     dbms=Excel replace;

     mixed=yes;

run;

 

Maybe.

The documentation states (emphasis added):

MIXED=YES | NO

assigns a SAS character type for the column and converts all numeric data values to character data values when mixed data types are found.

YES specifies that the connection is set to import mode and updates are not allowed. The XLSX and XLS formats assume MIXED=YES.
Note: Due to how the Microsoft ACE driver and the Microsoft Jet Excel driver work, using MIXED=YES could result in improper text variable lengths.
NO assigns numeric or character type for the column, depending on the majority of the type data that is found.

 

 

robeldritch
Obsidian | Level 7

Hi ballardw--thanks for the response. Unfortunately, this didn't end up working for me, as it still brought in some fields as numbers (namely, the date fields I described in my original post, which of course were read in as number of days since whatever date, as well as other fields). That said, I found a really roundabout solution that I'll post below...

Kurt_Bremser
Super User

The import wizard uses two stages: first, it converts the Excel spreadsheet to text on the client side, and then makes this text part of the code.

If you want full control of the import, you must save the data to a text (csv) file, upload it if you use SAS on a server, and then read the file with a DATA step.

robeldritch
Obsidian | Level 7

Hi Kurt,

 

Thanks for your response. Unfortunately, saving each file as a .csv, as much as I'd like to, isn't doable in this situation--the client will be receiving these .xlsx (and some are .xlsm) files (expected to be well over 100) at some point in the near future and won't have time to save all of them as such. I'm not aware of a way to save all the files as a .csv en masse. I tried something in SAS to that effect, but all it ended up doing was saving them with a .csv file extension but where the underlying data were still .xlsx, and this move somehow corrupted them such that they weren't openable even from the file explorer, much less in SAS. I could import them into SAS en masse with the intention of exporting as a .csv, but then I run into the problem that brought me here in the first place. 

 

That said, I found a really roundabout solution, which I'll post below...

robeldritch
Obsidian | Level 7

Ok, so here's the really roundabout solution involving DDE (yup...) I came up with, appropriating some things from a few different resources, including some old SAS Communities threads (I apologize for not citing/tagging the folks who have unbeknownst to them contributed to this ridiculous block of code):

 

%macro a; %do i=1 %to &n;
	options noxwait noxsync;
	%SYSEXEC "H:\misc\equip\data\core_&i..xlsx";

	%let ws=ReOp6-Equip Depr;
		%let r1=1;
		%let c1=1;
	  	%let r2=10000;
	  	%let c2=24;

	  	FileName xl
	    	DDE
	      	"Excel|&ws!r&r1.c&c1.:r&r2.c&c2."
	       	LRecL=5000
	       	NoTab;

	data _null_;  
		length fid rc start stop time 8;  
		fid=fopen('xl');  
		if (fid le 0) then do;    
			start=datetime();    
			stop=start+20;    
			do while (fid le 0);      
				fid=fopen('xl');      
				time=datetime();      
				if (time ge stop) then fid=1;      
				end;    
			end;  
		rc=fclose(fid); 
	run;

	data core_&i.b;
		Infile XL
		DSD
		Pad
		DLM = "09"x;

	    length Col1-Col%eval(&c2.-&c1.-1) $300;
	    format Col1-Col%eval(&c2.-&c1.-1) $300.;
	    input Col1-Col%eval(&c2.-&c1.-1 );
	run ;

	filename cmds dde 'excel|system';
	data _null_;
	  file cmds;
	  put '[close("H:\misc\equip\data\core_&i..xlsx")]';
	  put '[quit]';
	run;
%end; %mend a; %a

/*THIS THEN REPEATS FOR THE .xlsm WORKBOOKS...*/

It sure ain't very elegant or efficient, but it's the only thing I've tried so far that has actually worked for my purposes, so I'm not messing with it for now.

ballardw
Super User

@robeldritch wrote:

Ok, so here's the really roundabout solution involving DDE (yup...) I came up with, appropriating some things from a few different resources, including some old SAS Communities threads (I apologize for not citing/tagging the folks who have unbeknownst to them contributed to this ridiculous block of code):

 


IF you put this into a production process you will need to be aware that DDE is no longer supported and that the internal communications channels used by DDE may be preempted by other programs. I had one program that used DDE to "get around" opening files. Until it didn't work. Because our organization started using Cisco Jabber.

The processes on the system running SAS had to have Jabber KILLED, not just turned off, to allow DDE to work. That was 7 years ago. I cannot believe that there are fewer programs that will interfere with DDE now.

 

 

 

robeldritch
Obsidian | Level 7

If there are such programs that conflict with the DDE's requirements for establishing/maintaining a connection in our organization, they're not conflicting here. The only thing I'm doing (temporarily) to ensure the connection is established and maintained (until I close it) is to relax the protected view Trust settings when generating copies of the files in the folder (also done in SAS). This is just a one-time thing--I just want to be able to have it work this one time and as simply as possible (i.e., the less applications involved, the better). I'm appending specific row/column ranges from each sheet to a master table and inserting a column in each one with another value from elsewhere on each sheet as an identifier before appending--all of which I know how to do easily in SAS. So if I can do all the steps with one script in SAS, then that's more ideal for my situation.

Ksharp
Super User
proc import datafile='c:\temp\temp.xlsx' out=class dbms=excel replace;
sheet='xxxx';
dbdsopts="dbsastype=(age='char(20)' weight='char(20)' height='char(20)')";
run;

You can specify this kind of variable to be character type as well with EXCEL engine.

robeldritch
Obsidian | Level 7

Hi Ksharp,

 

Thanks for this--I had tried this solution already, and it's not giving me what I want. For example, even though it's bringing the date column in as a character variable, it's still just in the form of days since such and such date but as a string instead of a numerical variable. 

Tom
Super User Tom
Super User

@robeldritch wrote:

Hi Ksharp,

 

Thanks for this--I had tried this solution already, and it's not giving me what I want. For example, even though it's bringing the date column in as a character variable, it's still just in the form of days since such and such date but as a string instead of a numerical variable. 


This is the behavior that is causing the issue for you.  When SAS decides that a date (or datetime they are really the same in Excel since time of day is just stored as a fraction of day) value needs to be converted into a character variable it does not apply the Excel formatting to it.  Instead it just converts the underlying number (the value that is actually stored in the cell) into a character string.  The same thing happens with other EXCEL display formats.  For example currency displays.

 

Take this sheet:

Tom_0-1716055315063.png

If I use PROC IMPORT to read that then because the last row has character strings in every column all three variables will be character.

Tom_1-1716055370973.png

 

Your example of using DDE is one way to have SAS control EXCEL to do what you want.  But DDE is a deprecated technology and can by hijacked by random other processes on your PC making it not a reliable choice.

 

But for DDE to work you have to be running EXCEL locally anyway so you could instead use Visual Basic or Powershell commands instead to convert the sheets in the XLSX workbooks into CSV (or other text) files that could then be read by SAS as all character variables very easily. 

Tom
Super User Tom
Super User

If you know what display format you want applied to the values then just add a step to do that.

 

For example if your generated dataset is named FROM_EXCEL and the variable created from the column in the spreadsheet with the DATE values is named DATE you might want to run something like this:

data want;
  set from_excel;
  if not missing(input(date,??32.)) then date=put(input(date,32.)+'30dec1899'd,mmddyy10.);
run;
robeldritch
Obsidian | Level 7
Hi Tom,

I wish I knew what date format I wanted! The issue is that I don't, and the sheets I'm reading in are all manner of formats, including erroneous ones (e.g., 23/17/24). The client just wants everything preserved exactly as it's displayed on the Excel sheet (I realize the underlying data may look different). I'd be curious to see how this might be done with the other tools you mentioned above (power shell, etc). I realize DDE isn't ideal for the reasons you described, but it was the only solution that worked for me so far. Thanks for your input!
Tom
Super User Tom
Super User

Qucik google search found this StackOverflow question that has a visual basic script for converting all worksheets in all XLSX files in a folder into text files.

 

https://stackoverflow.com/questions/17141885/batch-convert-excel-to-text-delimited-files

 

Patrick
Opal | Level 21

@robeldritch 

Based on your answers it appears you're dealing with somewhat conflicting requirements and expectations. It might be worth for you to have a discussion with your customer and explain some of the differences between an Excel spreadsheet and a (SAS) table. 

You simply can't have source date strings that don't represent a valid date and then expect to store this as a SAS data value (that just MUST be a valid date). You also can't apply formats "cell based" like in Excel.

The only way I can think of to get strings in a SAS table as seen in an Excel spreadsheet is to save the Excel sheets as csv using the Excel save as... function and then use SAS to read these csv files into SAS character variables (and only character variables). 

 

@Tom already shared a link with you for a vbs script that does such conversion to csv files. I've also made a test with a chatGPT generated script which worked quite well from start (Microsoft knows Microsoft...).

 

If you want to take above avenue then for automation first questions are:

- Do you have option XCMD set?

- Can you confirm that your SAS environment executes under Windows.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 826 views
  • 6 likes
  • 6 in conversation