Hello,
I have a library of around 50 CSV datasets. I would like to import them all in one step (if possible) rather than importing individual files one by one with proc import. Can anyone help me with this?
PS: In other words, I am trying to avoid doing this 50 times:
PROC IMPORT OUT= WORK.dataset1
DATAFILE= "Path\dataset1.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
Thank you,
MK
Here you go:
%let path=C:\temp;
filename filelist pipe "dir /b &path\*.csv";
data dirList;
infile filelist truncover;
input CSVfilename $100.;
length SASTableName $32;
SASTableName=substrn(scan(CSVfilename,1,'.'),1,31);
/* valid SAS table name? */
if not nvalid(SASTableName) then
do;
/* covert string to valid SAS table name */
SASTableName=prxchange('s/^(\d)|\W+/_\1/oi',-1,strip(SASTableName));
end;
run;
filename codegen temp;
data _null_;
/* file print;*/
file codegen;
set dirList;
put
/ "PROC IMPORT OUT= WORK." SASTableName
/ " DATAFILE= '&path\" CSVfilename +(-1) "'"
/ " DBMS=CSV REPLACE;"
/ " GETNAMES=YES;"
/ " DATAROW=2;"
/ " guessingrows=max;"
/ "RUN;"
;
run;
%include codegen / source2;
What's not covered in above code:
1. substring of 32 characters with all non-valid characters replaced with underscore could lead to duplicate table names (unlikely but possible)
2. Proc Import with DATAROW=2 will throw an error in case of an empty .csv source file
What do you mean by a library of CSV files? If you have 50 CSV files that represent different sets of records of the same data then you should be able to read it with one data step. If you have 50 individual files with their own unique structure then you will need to read them each separately. It should be possible to make the computer generate the 50 PROC IMPORT steps for you.
Both of those situations are common questions on this forum.
Hi--,
I have 50 individual files with their own unique structure and names in one folder (please see attached photo). Each uque datasets have different variables in it. How can we set up the SAS code to repeat the import procedure 50 or so times?
Regards,
MK
Something like below should do.
%let path=C:\test;
Filename filelist pipe "dir /b &path\*.csv";
Data dirList;
Infile filelist truncover;
Input filename $100.;
Put filename=;
Run;
filename codegen temp;
data _null_;
/* file print;*/
file codegen;
set dirList;
put
/ "PROC IMPORT OUT= WORK.dataset_" _n_ z2.
/ " DATAFILE= '&path\" filename "'"
/ " DBMS=CSV REPLACE;"
/ " GETNAMES=YES;"
/ " DATAROW=2;"
/ " guessingrows=max;"
/ "RUN;"
;
run;
%include codegen / source2;
You could also use CALL EXECUTE() instead of writing the code to a temporary file and then execute it via an %INCLUDE statement.
I personally prefer the approach using a temporary file as this allows me to create code which I can easily examine during development without execution by writing it to PRINT.
Hi Patrick,
Thank you for the code! It works like magic. The only thing I would like to edit would be to assign the original CSV dataset names (listed in Dirlist) to the newly created SAS datasets, rather than creating SAS datasets with names dataset_01, dataset_02, etc, which is not really informative. I will try to work on that, but if you have an easy solution, please let me know.
Thank you all for posting solutions.
MK
Here you go:
%let path=C:\temp;
filename filelist pipe "dir /b &path\*.csv";
data dirList;
infile filelist truncover;
input CSVfilename $100.;
length SASTableName $32;
SASTableName=substrn(scan(CSVfilename,1,'.'),1,31);
/* valid SAS table name? */
if not nvalid(SASTableName) then
do;
/* covert string to valid SAS table name */
SASTableName=prxchange('s/^(\d)|\W+/_\1/oi',-1,strip(SASTableName));
end;
run;
filename codegen temp;
data _null_;
/* file print;*/
file codegen;
set dirList;
put
/ "PROC IMPORT OUT= WORK." SASTableName
/ " DATAFILE= '&path\" CSVfilename +(-1) "'"
/ " DBMS=CSV REPLACE;"
/ " GETNAMES=YES;"
/ " DATAROW=2;"
/ " guessingrows=max;"
/ "RUN;"
;
run;
%include codegen / source2;
What's not covered in above code:
1. substring of 32 characters with all non-valid characters replaced with underscore could lead to duplicate table names (unlikely but possible)
2. Proc Import with DATAROW=2 will throw an error in case of an empty .csv source file
This solution is very short and easy to run, it converted all 182 csv datasets into sas datasets. Thank you!
Importing entire directory of XLS file into SAS
This requires full SAS so if yo get a error you may
want to contact your IT dept and show them the error.
This does not solve a CSV problem because it deals with
a directory of excel files. I tried to use 'proc import'
but it crashed SAS. DM commands were ignored.
You have much more flexibilty with excel files.
HAVE
====
Directory of d:\xls
cars.xlsx 04/04/2017 04:53 PM 49,376
class.xlsx 04/04/2017 04:53 PM 9,253
iris.xlsx 04/04/2017 04:53 PM 12,870
WANT
====
WORK.CARS
WORK.CLASS
WORK.IRIS
WORKING CODE
============
rc=filename("mydir","d:\xls");
did=dopen("mydir");
rc=dosubl('
libname xel "d:/xls/&mem..xlsx";
data &mem;
set xel.&mem;
* _ _ _
_ __ ___ __ _| | _____ __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| | < __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___| \__,_|\__,_|\__\__,_|
;
CREATE A DIRECTORY OF EXCEL FILES
=================================
dm "dexport sashelp.class 'd:\xls\class.xlsx' replace";
dm "dexport sashelp.cars 'd:\xls\cars.xlsx' replace";
dm "dexport sashelp.iris 'd:\xls\iris.xlsx' replace";
* _ _ _
___ ___ | |_ _| |_(_) ___ _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|
;
%symdel mem / nowarn;
data _null_;
rc=filename("mydir","d:\xls");
did=dopen("mydir");
if did > 0 then do;
memcount=dnum(did);
do i=1 to memcount;
name=dread(did,i);
call symputx('mem',scan(name,1,'.'));
put name=;
rc=dosubl('
libname xel "d:/xls/&mem..xlsx";
data &mem;
set xel.&mem;
run;quit;
libname xel clear;
');
end;
end;
stop;
run;quit;
NAME=cars.xlsx
NOTE: Libref XEL was successfully assigned as follows:
Engine: EXCEL
Physical Name: d:/xls/cars.xlsx
NOTE: There were 428 observations read from the data set XEL.cars.
NOTE: The data set WORK.CARS has 428 observations and 15 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.07 seconds
NOTE: Libref XEL has been deassigned.
NAME=class.xlsx
NOTE: Libref XEL was successfully assigned as follows:
Engine: EXCEL
Physical Name: d:/xls/class.xlsx
NOTE: There were 19 observations read from the data set XEL.class.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
NOTE: Libref XEL has been deassigned.
NAME=iris.xlsx
NOTE: Libref XEL was successfully assigned as follows:
Engine: EXCEL
Physical Name: d:/xls/iris.xlsx
NOTE: There were 150 observations read from the data set XEL.iris.
NOTE: The data set WORK.IRIS has 150 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
NOTE: Libref XEL has been deassigned.
NOTE: DATA statement used (Total process time):
real time 0.55 seconds
cpu time 0.35 seconds
48 ! quit;
There's a solution I've been working on here, this was a question on here a few weeks ago (14 days to be exact).
https://github.com/statgeek/SAS-Tutorials/blob/master/Import_all_files_one_type
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.