BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
MiraKr_
Obsidian | Level 7

Hello,

I have a library of around 50 CSV datasets. I would like to import them all in one step (if possible) rather than importing individual files one by one with proc import.  Can anyone help me with this?

 

PS: In other words, I am trying to avoid doing this 50 times:

PROC IMPORT OUT= WORK.dataset1
DATAFILE= "Path\dataset1.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;

 

Thank you,

MK

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@MiraKr_

 

Here you go:

%let path=C:\temp;

filename filelist pipe "dir /b &path\*.csv";

data dirList;
  infile filelist truncover;
  input CSVfilename $100.;
  length SASTableName $32;
  SASTableName=substrn(scan(CSVfilename,1,'.'),1,31);

  /* valid SAS table name? */
  if not nvalid(SASTableName) then
    do;
      /* covert string to valid SAS table name */
      SASTableName=prxchange('s/^(\d)|\W+/_\1/oi',-1,strip(SASTableName));
    end;
run;


filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;
  set dirList;
  put 
    / "PROC IMPORT OUT= WORK." SASTableName
    / "  DATAFILE= '&path\" CSVfilename +(-1) "'" 
    / "  DBMS=CSV REPLACE;"
    / "  GETNAMES=YES;"
    / "  DATAROW=2;" 
    / "  guessingrows=max;"
    / "RUN;"
    ;
run;

%include codegen / source2;

What's not covered in above code:

1. substring of 32 characters with all non-valid characters replaced with underscore could lead to duplicate table names (unlikely but possible)

2. Proc Import with DATAROW=2 will throw an error in case of an empty .csv source file

View solution in original post

10 REPLIES 10
Tom
Super User Tom
Super User

What do you mean by a library of CSV files?  If you have 50 CSV files that represent different sets of records of the same data then you should be able to read it with one data step.  If you have 50 individual files with their own unique structure then you will need to read them each separately.  It should be possible to make the computer generate the 50 PROC IMPORT steps for you.

 

Both of those situations are common questions on this forum.  

 

MiraKr_
Obsidian | Level 7

Hi--,

I have 50 individual files with their own unique structure and names in one folder (please see attached photo). Each uque datasets have different variables in it. How can we set up the SAS code to repeat the import procedure 50 or so times? 

 

Regards,

MK


Capture1.JPG
Patrick
Opal | Level 21

Something like below should do.

%let path=C:\test;

Filename filelist pipe "dir /b &path\*.csv";

Data dirList;
  Infile filelist truncover;
  Input filename $100.;
  Put filename=;
Run;

filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;
  set dirList;
  put 
    / "PROC IMPORT OUT= WORK.dataset_" _n_ z2.
    / "  DATAFILE= '&path\" filename "'" 
    / "  DBMS=CSV REPLACE;"
    / "  GETNAMES=YES;"
    / "  DATAROW=2;" 
    / "  guessingrows=max;"
    / "RUN;"
    ;
run;

%include codegen / source2;

You could also use CALL EXECUTE() instead of writing the code to a temporary file and then execute it via an %INCLUDE statement.

I personally prefer the approach using a temporary file as this allows me to create code which I can easily examine during development without execution by writing it to PRINT.

MiraKr_
Obsidian | Level 7

Hi Patrick,

Thank you for the code! It works like magic. The only thing I would like to edit would be to assign the original CSV dataset names (listed in Dirlist) to the newly created SAS datasets, rather than creating SAS datasets with names dataset_01, dataset_02, etc, which is not really informative. I will try to work on that, but if you have an easy solution, please let me know. 

 

Thank you all for posting solutions.

 

MK

Patrick
Opal | Level 21

@MiraKr_

 

Here you go:

%let path=C:\temp;

filename filelist pipe "dir /b &path\*.csv";

data dirList;
  infile filelist truncover;
  input CSVfilename $100.;
  length SASTableName $32;
  SASTableName=substrn(scan(CSVfilename,1,'.'),1,31);

  /* valid SAS table name? */
  if not nvalid(SASTableName) then
    do;
      /* covert string to valid SAS table name */
      SASTableName=prxchange('s/^(\d)|\W+/_\1/oi',-1,strip(SASTableName));
    end;
run;


filename codegen temp;
data _null_;
/*  file print;*/
  file codegen;
  set dirList;
  put 
    / "PROC IMPORT OUT= WORK." SASTableName
    / "  DATAFILE= '&path\" CSVfilename +(-1) "'" 
    / "  DBMS=CSV REPLACE;"
    / "  GETNAMES=YES;"
    / "  DATAROW=2;" 
    / "  guessingrows=max;"
    / "RUN;"
    ;
run;

%include codegen / source2;

What's not covered in above code:

1. substring of 32 characters with all non-valid characters replaced with underscore could lead to duplicate table names (unlikely but possible)

2. Proc Import with DATAROW=2 will throw an error in case of an empty .csv source file

MiraKr_
Obsidian | Level 7

This solution is very short and easy to run, it converted all 182 csv datasets into sas datasets. Thank you!

rogerjdeangelis
Barite | Level 11
Importing entire directory of XLS file into SAS

This requires full SAS so if yo get a error you may
want to contact your IT dept and show them the error.

This does not solve a CSV problem because it deals with
a directory of excel files. I tried to use 'proc import'
but it crashed SAS. DM commands were ignored.

You have much more flexibilty with excel files.

HAVE
====

 Directory of d:\xls

  cars.xlsx    04/04/2017  04:53 PM  49,376
  class.xlsx   04/04/2017  04:53 PM   9,253
  iris.xlsx    04/04/2017  04:53 PM  12,870

WANT
====

   WORK.CARS
   WORK.CLASS
   WORK.IRIS

WORKING CODE
============

        rc=filename("mydir","d:\xls");
        did=dopen("mydir");

        rc=dosubl('
           libname xel  "d:/xls/&mem..xlsx";
           data &mem;
             set xel.&mem;

*                _                  _       _
 _ __ ___   __ _| | _____        __| | __ _| |_ __ _
| '_ ` _ \ / _` | |/ / _ \_____ / _` |/ _` | __/ _` |
| | | | | | (_| |   <  __/_____| (_| | (_| | || (_| |
|_| |_| |_|\__,_|_|\_\___|      \__,_|\__,_|\__\__,_|

;


CREATE A DIRECTORY OF EXCEL FILES
=================================

dm "dexport sashelp.class 'd:\xls\class.xlsx' replace";
dm "dexport sashelp.cars  'd:\xls\cars.xlsx' replace";
dm "dexport sashelp.iris  'd:\xls\iris.xlsx' replace";


*          _       _   _
 ___  ___ | |_   _| |_(_) ___  _ __
/ __|/ _ \| | | | | __| |/ _ \| '_ \
\__ \ (_) | | |_| | |_| | (_) | | | |
|___/\___/|_|\__,_|\__|_|\___/|_| |_|

;

%symdel mem / nowarn;
data _null_;

   rc=filename("mydir","d:\xls");
   did=dopen("mydir");
   if did > 0 then do;
        memcount=dnum(did);
        do i=1 to memcount;
           name=dread(did,i);
           call symputx('mem',scan(name,1,'.'));
           put name=;
           rc=dosubl('
              libname xel  "d:/xls/&mem..xlsx";
              data &mem;
                set xel.&mem;
              run;quit;
              libname xel clear;
           ');
        end;
    end;
    stop;

run;quit;

NAME=cars.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:/xls/cars.xlsx
NOTE: There were 428 observations read from the data set XEL.cars.
NOTE: The data set WORK.CARS has 428 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.07 seconds


NOTE: Libref XEL has been deassigned.
NAME=class.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:/xls/class.xlsx
NOTE: There were 19 observations read from the data set XEL.class.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


NOTE: Libref XEL has been deassigned.
NAME=iris.xlsx
NOTE: Libref XEL was successfully assigned as follows:
      Engine:        EXCEL
      Physical Name: d:/xls/iris.xlsx
NOTE: There were 150 observations read from the data set XEL.iris.
NOTE: The data set WORK.IRIS has 150 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds


NOTE: Libref XEL has been deassigned.
NOTE: DATA statement used (Total process time):
      real time           0.55 seconds
      cpu time            0.35 seconds

48  !     quit;

Reeza
Super User

There's a solution I've been working on here, this was a question on here a few weeks ago (14 days to be exact). 

 

https://github.com/statgeek/SAS-Tutorials/blob/master/Import_all_files_one_type

MiraKr_
Obsidian | Level 7
Thank you!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 7376 views
  • 3 likes
  • 5 in conversation