DATA Step, Macro, Functions and more

Importing entire library of CSV datasets into SAS

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 17
Accepted Solution

Importing entire library of CSV datasets into SAS

Hello,

I have a library of around 50 CSV datasets. I would like to import them all in one step (if possible) rather than importing individual files one by one with proc import.  Can anyone help me with this?

 

PS: In other words, I am trying to avoid doing this 50 times:

PROC IMPORT OUT= WORK.dataset1
DATAFILE= "Path\dataset1.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;

 

Thank you,

MK


Accepted Solutions
Solution
‎04-10-2017 09:35 AM
Respected Advisor
Posts: 4,173

Re: Importing entire library of CSV datasets into SAS

@MiraKr_

 

Here you go:

%let path=C:\temp;

filename filelist pipe "dir /b &path\*.csv";

data dirList;
  infile filelist truncover;
  input CSVfilename $100.;
  length SASTableName $32;
  SASTableName=substrn(scan(CSVfilename,1,'.'),1,31);

  /* valid SAS table name? */
  if not nvalid(SASTableName) then
    do;
      /* covert string to valid SAS table name */
      SASTableName=prxchange('s/^(\d)|\W+/_\1/oi',-1,strip(SASTableName));
    end;
run;


filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;
  set dirList;
  put 
    / "PROC IMPORT OUT= WORK." SASTableName
    / "  DATAFILE= '&path\" CSVfilename +(-1) "'" 
    / "  DBMS=CSV REPLACE;"
    / "  GETNAMES=YES;"
    / "  DATAROW=2;" 
    / "  guessingrows=max;"
    / "RUN;"
    ;
run;

%include codegen / source2;

What's not covered in above code:

1. substring of 32 characters with all non-valid characters replaced with underscore could lead to duplicate table names (unlikely but possible)

2. Proc Import with DATAROW=2 will throw an error in case of an empty .csv source file

View solution in original post


All Replies
Super User
Super User
Posts: 7,042

Re: Importing entire library of CSV datasets into SAS

[ Edited ]

What do you mean by a library of CSV files?  If you have 50 CSV files that represent different sets of records of the same data then you should be able to read it with one data step.  If you have 50 individual files with their own unique structure then you will need to read them each separately.  It should be possible to make the computer generate the 50 PROC IMPORT steps for you.

 

Both of those situations are common questions on this forum.  

 

Occasional Contributor
Posts: 17

Re: Importing entire library of CSV datasets into SAS

Hi--,

I have 50 individual files with their own unique structure and names in one folder (please see attached photo). Each uque datasets have different variables in it. How can we set up the SAS code to repeat the import procedure 50 or so times? 

 

Regards,

MK


Capture1.JPG
Respected Advisor
Posts: 4,173

Re: Importing entire library of CSV datasets into SAS

[ Edited ]

Something like below should do.

%let path=C:\test;

Filename filelist pipe "dir /b &path\*.csv";

Data dirList;
  Infile filelist truncover;
  Input filename $100.;
  Put filename=;
Run;

filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;
  set dirList;
  put 
    / "PROC IMPORT OUT= WORK.dataset_" _n_ z2.
    / "  DATAFILE= '&path\" filename "'" 
    / "  DBMS=CSV REPLACE;"
    / "  GETNAMES=YES;"
    / "  DATAROW=2;" 
    / "  guessingrows=max;"
    / "RUN;"
    ;
run;

%include codegen / source2;

You could also use CALL EXECUTE() instead of writing the code to a temporary file and then execute it via an %INCLUDE statement.

I personally prefer the approach using a temporary file as this allows me to create code which I can easily examine during development without execution by writing it to PRINT.

Occasional Contributor
Posts: 17

Re: Importing entire library of CSV datasets into SAS

[ Edited ]

Hi Patrick,

Thank you for the code! It works like magic. The only thing I would like to edit would be to assign the original CSV dataset names (listed in Dirlist) to the newly created SAS datasets, rather than creating SAS datasets with names dataset_01, dataset_02, etc, which is not really informative. I will try to work on that, but if you have an easy solution, please let me know. 

 

Thank you all for posting solutions.

 

MK

Solution
‎04-10-2017 09:35 AM
Respected Advisor
Posts: 4,173

Re: Importing entire library of CSV datasets into SAS

@MiraKr_

 

Here you go:

%let path=C:\temp;

filename filelist pipe "dir /b &path\*.csv";

data dirList;
  infile filelist truncover;
  input CSVfilename $100.;
  length SASTableName $32;
  SASTableName=substrn(scan(CSVfilename,1,'.'),1,31);

  /* valid SAS table name? */
  if not nvalid(SASTableName) then
    do;
      /* covert string to valid SAS table name */
      SASTableName=prxchange('s/^(\d)|\W+/_\1/oi',-1,strip(SASTableName));
    end;
run;


filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;
  set dirList;
  put 
    / "PROC IMPORT OUT= WORK." SASTableName
    / "  DATAFILE= '&path\" CSVfilename +(-1) "'" 
    / "  DBMS=CSV REPLACE;"
    / "  GETNAMES=YES;"
    / "  DATAROW=2;" 
    / "  guessingrows=max;"
    / "RUN;"
    ;
run;

%include codegen / source2;

What's not covered in above code:

1. substring of 32 characters with all non-valid characters replaced with underscore could lead to duplicate table names (unlikely but possible)

2. Proc Import with DATAROW=2 will throw an error in case of an empty .csv source file

Occasional Contributor
Posts: 17

Re: Importing entire library of CSV datasets into SAS

This solution is very short and easy to run, it converted all 182 csv datasets into sas datasets. Thank you!

Valued Guide
Posts: 505

Re: Importing entire library of CSV datasets into SAS

Importing entire directory of XLS file into SAS

This requires full SAS so if yo get a error you may
want to contact your IT dept and show them the error.

This does not solve a CSV problem because it deals with
a directory of excel files. I tried to use 'proc import'
but it crashed SAS. DM commands were ignored.

You have much more flexibilty with excel files.

HAVE
====

 Directory of d:\xls

  cars.xlsx    04/04/2017  04:53 PM  49,376
  class.xlsx   04/04/2017  04:53 PM   9,253
  iris.xlsx    04/04/2017  04:53 PM  12,870

WANT
====

   WORK.CARS
   WORK.CLASS
   WORK.IRIS

WORKING CODE
============

        rc=filename("mydir","d:\xls");
        did=dopen("mydir");

        rc=dosubl('
           libname xel  "d:/xls/&mem..xlsx";
           data &mem;
             set xel.&mem;

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;


CREATE A DIRECTORY OF EXCEL FILES
=================================

dm "dexport sashelp.class 'd:\xls\class.xlsx' replace";
dm "dexport sashelp.cars  'd:\xls\cars.xlsx' replace";
dm "dexport sashelp.iris  'd:\xls\iris.xlsx' replace";


*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

%symdel mem / nowarn;
data _null_;

   rc=filename("mydir","d:\xls");
   did=dopen("mydir");
   if did > 0 then do;
        memcount=dnum(did);
        do i=1 to memcount;
           name=dread(did,i);
           call symputx('mem',scan(name,1,'.'));
           put name=;
           rc=dosubl('
              libname xel  "d:/xls/&mem..xlsx";
              data &mem;
                set xel.&mem;
              run;quit;
              libname xel clear;
           ');
        end;
    end;
    stop;

run;quit;

NAME=cars.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:/xls/cars.xlsx
NOTE: There were 428 observations read from the data set XEL.cars.
NOTE: The data set WORK.CARS has 428 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.07 seconds


NOTE: Libref XEL has been deassigned.
NAME=class.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:/xls/class.xlsx
NOTE: There were 19 observations read from the data set XEL.class.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


NOTE: Libref XEL has been deassigned.
NAME=iris.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:/xls/iris.xlsx
NOTE: There were 150 observations read from the data set XEL.iris.
NOTE: The data set WORK.IRIS has 150 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds


NOTE: Libref XEL has been deassigned.
NOTE: DATA statement used (Total process time):
      real time           0.55 seconds
      cpu time            0.35 seconds

48  !     quit;

Occasional Contributor
Posts: 17

Re: Importing entire library of CSV datasets into SAS

Posted in reply to rogerjdeangelis
Thank you
Super User
Posts: 19,789

Re: Importing entire library of CSV datasets into SAS

There's a solution I've been working on here, this was a question on here a few weeks ago (14 days to be exact). 

 

https://github.com/statgeek/SAS-Tutorials/blob/master/Import_all_files_one_type

Occasional Contributor
Posts: 17

Re: Importing entire library of CSV datasets into SAS

Thank you!
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 218 views
  • 2 likes
  • 5 in conversation