BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
LuGa
Obsidian | Level 7

Dear SAS community,

 

According to :

15778 - How to specify formats when importing data from Excel (sas.com)
proc import should allow specifying formats for imports.

However the statement returns an error:

 

PROC IMPORT OUT= WORK.relabel
          DATAFILE= "&path\Variables and Labels(13NOV2023).xlsx"
          DBMS=EXCEL REPLACE;
     GETNAMES=YES;
	FORMAT Member $15.Num 10. NewNum 10. Variable $40. Type $5. Len 5. Format $20. Informat $20. Label $40. NewLabel $40. Del 1. SortNo1 1.;
RUN;
37   PROC IMPORT OUT= WORK.relabel
38             DATAFILE= "&path\Variables and Labels(13NOV2023).xlsx"
SYMBOLGEN:  Macro variable PATH resolves to XXX
39             DBMS=EXCEL REPLACE;
40        GETNAMES=YES;
41       FORMAT Member $15.Num 10. NewNum 10. Variable $40. Type $5. Len 5. Format $20. Informat $20. Label $40. NewLabel $40. Del 1. SortNo1 1.;
42   RUN;

WARNING: FORMAT statement is not supported and will be ignored in this procedure. Use PROC DATASETS to modify the data set instead.
SYMBOLGEN:  Macro variable _IMEXSERROR_ resolves to SERROR
NOTE: WORK.RELABEL data set was successfully created.
NOTE: The data set WORK.RELABEL has 3459 observations and 12 variables.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.74 seconds
      cpu time            0.29 seconds

I would like to implement specification sheets for database extracts. The database extracts are .sas7bdat files. The excel sheets should help users what columns they want to have visualized in the final excel workbook. Converting the .xlsx files to .csv is not an option in this case (like in: Solved: PROC IMPORT XLSX File and Specify Formats - SAS Support Communities).

 

Thanks in advance for assistance and best regards

Lukas

1 ACCEPTED SOLUTION

Accepted Solutions
LuGa
Obsidian | Level 7

Hi Kurt, 

 

thanks for pointing that out.

 

Since R (Cell and Column Types • readxl (tidyverse.org))

 

library(readxl)
read_excel("yo.xlsx", col_types = c("date", "skip", "guess", "numeric"))

or Python (Python pandas: how to specify data types when reading an Excel file? - Stack Overflow)

 

 

import pandas as pd
df = pd.read_excel('Book1.xlsx',sheetname='Sheet1',header=0,converters={'names':str,'ages':str})

 

are both capable of handling excel data for professional purposes, I will prefer to use those for the required task.

 

Thanks and BR

Lukas

View solution in original post

9 REPLIES 9
Patrick
Opal | Level 21

I've never seen this syntax from the usage note you reference https://support.sas.com/kb/15/778.html 

It's also a weird note because the table indicates it's a bug without a fix but the code looks like it should be a working solution ....and that for the very old version SAS 9.1.3

It's clearly not in the documentation: https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/acpcref/n0msy4hy1so0ren1acm90iijxn8j.htm 

 

Suggest you use syntax as suggested by the warning text.

WARNING: FORMAT statement is not supported and will be ignored in this procedure. Use PROC DATASETS to modify the data set instead.

 

Proc Datasets changes the descriptor portion of your table in-place. This process will take almost no time and is also independent of the data volume in the table.

I've made the assumption that the format statement in your code contains key/value pairs of variable name and related format - so some variables are called Format or Label. 

If that's not the case but you want to define formats, labels etc. for specific variables then the syntax needs to change. It's all documented here.

PROC IMPORT OUT= WORK.relabel
  DATAFILE= "&path\Variables and Labels(13NOV2023).xlsx"
  DBMS=EXCEL 
  REPLACE;
  GETNAMES=YES;
RUN;

proc datasets lib=work nolist;
  modify relabel;
    FORMAT 
      Member $15.
      Num 10. 
      NewNum 10. 
      Variable $40. 
      Type $5. 
      Len 5. 
      Format $20. 
      Informat $20. 
      Label $40. 
      NewLabel $40. 
      Del 1. 
      SortNo1 1.
      ;
  run;
quit;

 

 

LuGa
Obsidian | Level 7

Hi Patrick,

 

thanks for your reply. Yes it is indeed interesting how inconsistent the documentation of SAS is. 

 

Your approach did not help unfortunately. 

ERROR: You are trying to use the numeric format F with the character variable NewNum in data set WORK.RELABEL.

This is the error from your suggested proc datasets procedure.

 

As asked in my initial question the excel file should act as a configuration file for non-programmers. That means, that a variable such as Del should be initialized as numeric, no matter what is written in the column. 

 

Is this possible in SAS?

Kurt_Bremser
Super User
If you want control over variable types and other attributes, then FORGET the Excel file format and PROC IMPORT. Save the data to a text file and read that with a DATA step.
LuGa
Obsidian | Level 7

Dear Kurt, 

 

thanks for your help.

 

Unfortunately this is exactly what I tried to avoid (see my initial post). 

 

BR
Lukas

andreas_lds
Jade | Level 19

The format-statement assigns formats to the variables written to the dataset, it has no impact on how values are read from a file. So even if the format statement could be used in proc format, it won't fix the problem of types and lengths being guessed by proc import. The only way to avoid such problems is using text files and writing a data step to read the file.

Kurt_Bremser
Super User

In order to have solid, consistent results which are usable in further automation, text files and DATA steps are the way to go. IMPORT is a stopgap, usable for a quick shot, but not for professional purposes.

Since all SAS methods to read Excel files are making guesses based on the current contents, the same holds true. Can be used for a quick shot, but not for professional work.

LuGa
Obsidian | Level 7

Hi Kurt, 

 

thanks for pointing that out.

 

Since R (Cell and Column Types • readxl (tidyverse.org))

 

library(readxl)
read_excel("yo.xlsx", col_types = c("date", "skip", "guess", "numeric"))

or Python (Python pandas: how to specify data types when reading an Excel file? - Stack Overflow)

 

 

import pandas as pd
df = pd.read_excel('Book1.xlsx',sheetname='Sheet1',header=0,converters={'names':str,'ages':str})

 

are both capable of handling excel data for professional purposes, I will prefer to use those for the required task.

 

Thanks and BR

Lukas

Ksharp
Super User
Here is TWO examples:

proc import datafile = "c:\temp\date.xlsx" out =have replace dbms =excel ;
dbdsopts="dbsastype=(age='numeric' weight='char(20)' weight='char(20)')";
run;


libname x excel 'c:\temp\date.xlsx';
data have2;
set x.'date$'n(dbsastype=(age='numeric' weight='char(20)' weight='char(20)'));
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1016 views
  • 3 likes
  • 5 in conversation