Hi....I have 26 subfolders in a folder and each subfolder has a password protected excel file. I would like to retrieve the data from a specific worksheet in each of these excel files into one dataset. The name of the worksheets are the same on each worksheet. I do have PC Files. How can I do this...thanks.
Nope. Excel/SAS don't work that way.
If the cell in Excel has a date value (like the 43,921 value in your first row) but some other cells in that column have character strings (even ones that look to you like they are dates like "2020-03-31" or "31MAR2020") then SAS will make a CHARACTER variable and put into the variable a character representation of the number used for that date, like "43921".
You can convert that string into a number once you have it in SAS and then adjust the number to the number that SAS uses for that date and attach date type format so that humans will recognize the value. So if the variable created for that column is named CHAR you can make a new variable named NUM that has SAS date values and is formatted with the DATE9 format for human consumption with a data step like this:
data want;
set have;
num = input(char,32.) + '30DEC1899'd ;
format num date9.;
run;
The program might need to be more complicated because you probably do have strings like the examples I listed above. So to convert those to numbers you need to use an appropriate INFORMAT.
data want;
set have;
num = input(char,??32.) ;
if not missing(num) then num=num + '30DEC1899'd ;
else if not missing(char) then num=input(char,anydtdte.);
format num date9.;
run;
No...this will have to be done each week as each excel files is updated daily by other user to be used for weekly reports.
I would investigate some other "secure" data transfer if at all possible.
I've had too many instances where the "password" I was provided wouldn't work with a once a month file. 26 files a week makes me cringe.
Plus the added headaches of the inconsistencies of entirely too many Excel data set contents and trying to read them. Is this "other user" manually changing these? Nothing like someone using cell background colors to impart "information".
I see a process that is going to require a lot of maintenance to keep working.
Each excel file is from the same template one for each of 26 products. End users only populate the excel files with new information becomes available for the specific product. I think the reason it was decided to password protect the excel files was a way to control of new information being entered so that no two users could entered new information at the same time and overriding the other users entered information.
There's this: https://www.sascommunity.org/mwiki/index.php?title=Streamlining_Data-Driven_SAS_With_The_%FOR_Macro
One of the examples is: Import all spreadsheets in all subfolders of a top folder
I have no idea how to get it to handle passwords.
If having the files password protected a necessity, would having the worksheet that is needed from each excel file saved as an Access File and password protected, would working with Access files that are password protected easier to work with?
Hi.....I am getting an error message when I am trying to merge all the datasets together which I think is related to a decimal point being attached at the end of each dataset that is being imported.
%let path=V:\Test; filename folder "&path\"; libname project 'I:\Output'; libname c 'I:\Output\Temp'; options validmemname=extend; data FilesInFolder; length Line 8 File $300; List = dopen('folder'); do Line = 1 to dnum(List); File = trim(dread(List,Line)); output; end; drop list line; run; data _null_; set FilesInFolder end=final; call symput(cats('File', _N_), trim(File)); call symput(cats('Name', _N_), trim(nliteral(substr(File,1,min(32, length(File)-4))))); if final then call symputx(trim('Total'), _N_); run; %macro loop; %do i = 1 %to &Total; proc import datafile="&path\&&File&i" out=c.&&name&i dbms=excel replace; sheet="F2020"; range="$B10:AN200"; getnames=yes; run; %end; %mend loop; %loop proc sql noprint; select trim(libname) || '.' || memname into :dataset_vars separated by ' ' from dictionary.tables where upcase(libname)="C" and upcase(memname) like '%' ; quit; data want; set &dataset_vars; run; Error Message: 36 37 data want; 38 set &dataset_vars; NOTE: Line generated by the macro variable "DATASET_VARS". 38 C.CF. C.NCT. C.WANT _____ 22 201 ERROR 22-322: Syntax error, expecting one of the following: a name, a quoted string, ;, CUROBS, END, INDSNAME, KEY, KEYRESET, KEYS, NOBS, OPEN, POINT, _DATA_, _LAST_, _NULL_. ERROR 201-322: The option is not recognized and will be ignored.
Looks like you are generating NAME by chopping some characters off of the end of FILE.
call symput(cats('Name', _N_), trim(nliteral(substr(File,1,min(32, length(File)-4)))));
You probably want to instead just take the characters up to the period.
call symputX(cats('Name', _N_), nliteral(substrN(scan(File,1,'.'),1,32)));
Note that using the modern (10-15 years old?) call symputX() instead old call symput() simplifies things. Using substrN() also simplifies truncating to max of 32 bytes.
Thanks Tom....that did the trick....I have included the "usedate=yes" in the proc import and I am still getting variables import both as character and numeric. Isn't by including this statement in the proc import to prevent this as it doesn't matter to me which one but not both.
I don't use DBMS=EXCEL, I use DBMS=XLSX (or they are old files DBMS=XLS).
I doubt if usedate has any impact on whether variables are defined as numeric or characters. That is probably control by how you defined the cells in the excel file. If any of the cells in a column contain text then the whole column has to be defined as character. SAS can convert the numbers into strings, but it cannot convert the strings into numbers (at least not in a reproducible,meaningful way). If you want the variable to be a numeric variable with date values and a date format attached to it then make sure all of the cells in the spreadsheet have dates in them and not character strings.
Hi Tom....I did change the DBMS to XLSX and it worked but the dates that being imported are being imported numerically. When re-format the numeric dates, the results are not correct. I extracted a few dates that was in the excel file that was being imported and the results when re-formatting. The reformatted dates seem to reformat one day later and 60 years later.
Entered Displayed Imported Numeric Formatted 2020-03-31 31-Mar-20 43921 01Apr2080 2019-11-30 30-Nov-19 43799 01Dec2079 2021-07-10 10-Jul-21 44387 11Jul2081 2020-09-18 18-Sep-20 44092 19Sep2080 2021-07-10 30-Dec-20 44195 31Dec2080
Nope. Excel/SAS don't work that way.
If the cell in Excel has a date value (like the 43,921 value in your first row) but some other cells in that column have character strings (even ones that look to you like they are dates like "2020-03-31" or "31MAR2020") then SAS will make a CHARACTER variable and put into the variable a character representation of the number used for that date, like "43921".
You can convert that string into a number once you have it in SAS and then adjust the number to the number that SAS uses for that date and attach date type format so that humans will recognize the value. So if the variable created for that column is named CHAR you can make a new variable named NUM that has SAS date values and is formatted with the DATE9 format for human consumption with a data step like this:
data want;
set have;
num = input(char,32.) + '30DEC1899'd ;
format num date9.;
run;
The program might need to be more complicated because you probably do have strings like the examples I listed above. So to convert those to numbers you need to use an appropriate INFORMAT.
data want;
set have;
num = input(char,??32.) ;
if not missing(num) then num=num + '30DEC1899'd ;
else if not missing(char) then num=input(char,anydtdte.);
format num date9.;
run;
Hi Tom....is there any way to import all variables from the worksheets as character. I switched to dbms=xlsx and from the search I was able to do indicated that by using dbms=xlsx automatically converts all variables to character. Did I misunderstand something. I asked this because one of the worksheets have numeric variables.
DBMS=XLSX does not have any option to force a type on a column. The type is determined by the type of the cells in the column.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.