BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SASUserRocks
Calcite | Level 5

Hi,

 

I am trying to do proc import from Excel workbook to load some empty fields which will be later added with integer values. The dataset is guessing rows in a way that string datatype is assigned to these variables. Please help.

 

Thanks

Anjali

1 ACCEPTED SOLUTION

Accepted Solutions
smantha
Lapis Lazuli | Level 10
%macro abc;
proc options option=validvarname;
run;

proc import replace datafile= "/SASAnalytics/DA/Daily_File_20200516.xlsx"
out=DAShare2.DSN_TEMP
DBMS=XLSX;
run;
proc export data=NEW dbms=csv

outfile="temp.csv"

replace;

run;
%let varlist= cookie $28 date $10 timestamp $28 Last_url_visited $800 ip $28 stage $28 device $28 os $28 browser $28 Page $28 Product $28
SubProduct $120 Pname $8 cookieBirth $8 typnmarr $28 dslv 5 last_10_urls_visited $400 taxonomy $400 dsdo 8 clickid $28 DSLV_cards 8
DSLV_accounts 8 DSLV_loans 8 DSLV_personal_loan_expats 8 DSLV_bca_silver 8 DSLV_betaqti_credit_card 8 DSLV_etihad_GA_account 8
DSLV_credit_cards 8 DSLV_touchpoint_ic 8 DSLV_privilege_C_Plat_DC 8 DSLV_millionaire_D_Saving_Acc 8 DSLV_traveller_credit_card 8
DSLV_adcb_titanium_debit_card 8 DSLV_touchpoint_platinum_cc 8 DSLV_fixed_deposit_account 8 DSLV_active_saver_account 8 DSLV_excellence_plat_DC 8
DSLV_etihad_guest_Abo_Exce 8 DSLV_personal_loan_uaeNationl 8 DSLV_etihad_guest_Infin_CC 8 DSLV_personal_finance_for_expats 8
DSLV_personal_finance_UAENation 8 DSLV_etihad_guest_privilege_acc 8 DSLV_etihad_guest_excellency_acc 8 DSLV_etihad_platinum_credit_card 8
DSLV_etihad_guest_accounts 8;
data dsn;
informat <put your varlist here> ;
infile "test.csv" <options> ;
input <put your varlist here> ;
set DAShare2.DSN_temp;
 run;
%mend;
%abc;

View solution in original post

23 REPLIES 23
SASUserRocks
Calcite | Level 5

@Kurt_Bremser  There are around 200 xlsx , I wont be able to manually convert them to csv.

Kurt_Bremser
Super User

@SASUserRocks wrote:

@Kurt_Bremser  There are around 200 xlsx , I wont be able to manually convert them to csv.


Using the worst imaginable file format for mass data transfer is at best stupid. No matter what you do, you end up with manual conversions and/or fixes, because every single import will end up with more or less subtle differences.

You might be able to automate a xlsx to csv conversion with LibreOffice or VBA.

 

Have the data sent in a sensible format in the first place.

Excel is only good for manual tinkering, not for automatable (and reliably automatable) processes.

Jagadishkatam
Amethyst | Level 16

Please try to use the mixed=yes option to avoid the issue

 

PROC IMPORT OUT= mydata
     DATAFILE= "J:\users\nsouli\Debugging.xls" 
     DBMS=EXCEL REPLACE;
     SHEET="'SAS$'"; 
     GETNAMES=YES;
     MIXED=yes;     
RUN;
Thanks,
Jag
SASUserRocks
Calcite | Level 5

@Jagadishkatam Tried below code and its throwing error "ERROR: DBMS type EXCEL not valid for import."

 

proc import replace datafile= "/SASAnalytics/DA/Daily_File_20200516.xlsx"
out= DAShare2.Test
dbms=excel ;
DBDSOPTS= "DBTYPE=(DSLV_personal_finance_UAENation='NUM(8)')";
MIXED=yes;

run;

ed_sas_member
Meteorite | Level 14

Hi @SASUserRocks 

 

You can try to convert your excel file in csv / txt. Then you can use a DATA step to import your file and specify the type of your variables with the correct informat in an INPUT statement.

 

Best,

SASUserRocks
Calcite | Level 5

There are huge number of xlsx files that I will receive everyday and not willing to create overhead at SAS server.

smantha
Lapis Lazuli | Level 10

Anjali,

You keep mentioning large amount of excel files is it 100s, 1000s. I hope you do not process them manually. Unfortunately SAS has minimal control on reading data from excel as opposed to other formats. If you know what the fields are and what the formats should be you can do explicit conversion after reading it into a sas dataset.

SASUserRocks
Calcite | Level 5

Hi @smantha  , Currently its 200, but this is daily incremental file that keeps coming till future date. I am trying below code to work.

 


options validvarname=any;
%macro abc;
proc import replace datafile= "/SASAnalytics/DA/Weblog_Lemnisk/ProdData/NewSet/Daily_File_20200516.xlsx"
out=DAShare2.DSN_TEMP
DBMS=XLSX;

%let varlist= cookie $20 date $10 timestamp $20 Last_url_visited $800 ip $20 stage $20 device $20 os $20 browser $20 Page $120 Product $120
SubProduct $120 Pname $120 cookieBirth $20 typnmarr $120 dslv 5 last_10_urls_visited $400 taxonomy $400 dsdo 20 clickid $120 DSLV_cards 20
DSLV_accounts 20 DSLV_loans 20 DSLV_personal_loan_expats 20 DSLV_bca_silver 20 DSLV_betaqti_credit_card 20 DSLV_etihad_GA_account 20
DSLV_credit_cards 20 DSLV_touchpoint_ic 20 DSLV_privilege_C_Plat_DC 20 DSLV_millionaire_D_Saving_Acc 20 DSLV_traveller_credit_card 20
DSLV_adcb_titanium_debit_card 20 DSLV_touchpoint_platinum_cc 20 DSLV_fixed_deposit_account 20 DSLV_active_saver_account 20 DSLV_excellence_plat_DC 20
DSLV_etihad_guest_Abo_Exce 20 DSLV_personal_loan_uaeNationl 20 DSLV_etihad_guest_Infin_CC 20 DSLV_personal_finance_for_expats 20
DSLV_personal_finance_UAENation 20 DSLV_etihad_guest_privilege_acc 20 DSLV_etihad_guest_excellency_acc 20 DSLV_etihad_platinum_credit_card 20
DSLV_etihad_guest_accounts 20;

data dsn;
  length &varlist ;
  set DAShare2.DSN_temp;
  format _all_;
 run;
%mend;
%abc;

smantha
Lapis Lazuli | Level 10

First you can check if the formats of variable from the excel read is consistent with what you expect. If not then write out explicit conversions. 200 files is not that many, being xlsx i can guess at most a million rows, unless the number of sheets you read is more than 5. Can you request whoever is generating these files to write as csv?

Kurt_Bremser
Super User

20 is not a valid length for numeric variables (maximum is 8). Your code will also not work if a column was imported with the wrong type (numeric vs. character).

I can only repeat: from 20+ years of SAS experience, using Excel files for this is madness.

SASUserRocks
Calcite | Level 5

Accept that fact. We will get files in CSV format in future , until then I have to process xlsx files for the dashboard.

Tom
Super User Tom
Super User

Close.  Try this instead.  Use PROC IMPORT to read it in. Use data step to write it out to temporary CSV file. Then use another data step to read in the data from the CSV file.

 

The VALIDVARNAME=ANY doesn't really do you any good, none of the names you want to use require it. The length for numeric variables should be 8 since SAS stores them as 8 byte floating point numbers.  If you have any actual date, time or datetime values then you will want to add a FORMAT statement to the data step that creates the final dataset from the temporary csv file.  But you don't need to write formatted values to the temporary CSV file, so no need for any INFORMAT statements.

%macro abc(infile,dsn);
* Import sheet ;
proc import  dbms=xlsx out=_temp_ replace
  datafile="/SASAnalytics/DA/Weblog_Lemnisk/ProdData/NewSet/&infile..xlsx"
;
run;
* Write to CSV file ;
filename csv temp;
data _null_;
  file csv dsd ;
  set _temp_;
  put (_all_) (+0);
  format _numeric_ best32. _character_;
run;

* Read from CSV file ;
data &dsn;
  length
    cookie $20
    date $10
    timestamp $20
    Last_url_visited $800
    ip $20
    stage $20
    device $20
    os $20
    browser $20
    Page $120
    Product $120
    SubProduct $120
    Pname $120
    cookieBirth $20
    typnmarr $120
    dslv 8
    last_10_urls_visited $400
    taxonomy $400
    dsdo 8
    clickid $120
    DSLV_cards 8
    DSLV_accounts 8
    DSLV_loans 8
    DSLV_personal_loan_expats 8
    DSLV_bca_silver 8
    DSLV_betaqti_credit_card 8
    DSLV_etihad_GA_account 8
    DSLV_credit_cards 8
    DSLV_touchpoint_ic 8
    DSLV_privilege_C_Plat_DC 8
    DSLV_millionaire_D_Saving_Acc 8
    DSLV_traveller_credit_card 8
    DSLV_adcb_titanium_debit_card 8
    DSLV_touchpoint_platinum_cc 8
    DSLV_fixed_deposit_account 8
    DSLV_active_saver_account 8
    DSLV_excellence_plat_DC 8
    DSLV_etihad_guest_Abo_Exce 8
    DSLV_personal_loan_uaeNationl 8
    DSLV_etihad_guest_Infin_CC 8
    DSLV_personal_finance_for_expats 8
    DSLV_personal_finance_UAENation 8
    DSLV_etihad_guest_privilege_acc 8
    DSLV_etihad_guest_excellency_acc 8
    DSLV_etihad_platinum_credit_card 8
    DSLV_etihad_guest_accounts 8
  ;
  infile csv dsd truncover ;
  input (_all_) (+0);
 run;
%mend;

*options validvarname=any;
%abc(Daily_File_20200516,dsn1)
%abc(Daily_File_20200517,dsn2)

 

ballardw
Super User

How similar in structure are these files? This means column order with same intended values/types?

How many header rows (a major bane of Proc Import)? Does this number change in files of the same "structure".

 

If the files have a similar structure you might be able to use either LIBNAME PCFILES or EXCEL to access the data. Then you would have meta data information available available from the SASHELP.Vcolumns or Dictionary.Columns when using Proc SQL. Then write/ modify code to create new SAS datasets as needed, variable names, types, labels and such using data step code. But if you have lots of different layouts this can get cumbersome because you will have to have code that examines pretty much each column of the spreadsheet with the desired type and implement the correct code for conversion.

 

The details are important though. If the column orders or numbers of header rows change then you can get into very complicated headaches in attempting to automate. I have one source of XLSX I have to deal with where the 1) column orders change from month to month, 2) the number of header rows can change from month to month and 3) the "values" of a file or personal identification value will change to DATE formats partway through the file. And I have no way I can think of that would "automate" that conversion with out testing every single value in every single column.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 23 replies
  • 3444 views
  • 0 likes
  • 7 in conversation