Importing several files into SAS

Accepted Solution Solved
Reply
Regular Contributor
Posts: 180
Accepted Solution

Importing several files into SAS

Hello all,

 

I have a folder containing several CSV files. I wish to write a small SAS program (based on macro) that reads all the files into SAS datasets. My input is a folder with CSV files, my output should be the same set of SAS datasets, stored in a library called, let's say c, or any other name (not in WORK as I want them permanent).

 

I have tried something but stuck in the middle. First I read all the names of the files in the folder into a datset, which is the list of all file names. Then I create macro variables giving each file an index and another macro variable counting the total number of files. Now I need to write a small macro that will use proc import to import each file into SAS and store it in a dataset. I have a bit of trouble defining the different names (as it is fairly easy to run in a loop and just import a file).

 

My code so far is:

 

 

%let path=C:\My_SAS_PROJECTS;
filename folder "&path\Data.";
 
/* Making a list of all files in the folder */
data FilesInFolder;
   length Line 8 File $300;
   List = dopen('FilesInFolder');
   do Line = 1 to dnum(List);
        File = trim(dread(List,Line));
        output;
   end;
   drop list line;
run;
 
/* Creating local macro variables */
data _NULL_;
     set FilesInFolder end=final;
     call symput(compress('File'||_N_),trim(File));
     if final then call symput(trim('Total'),_N_);
run;
  
/* This macro should import all files specified in the list and save them as datasets */
%macro loop;
 
%do i = 1 %to &Total;
 
proc import datafile="&path\Data\&&File&i"
     out=&name&i.
     dbms=csv
     replace;
     getnames=no;
run;
.
.
.
. 
 
%end;
 %mend loop;
 
%loop
 

 

 

Can you assist me completing the macro function ?

 

Thank you !


Accepted Solutions
Solution
‎01-06-2016 03:48 AM
Trusted Advisor
Posts: 1,115

Re: Importing several files into SAS

I've made a few modifications to your code. If the names of your CSV files are valid SAS dataset names, you can omit the VALIDMEMNAME option and the NLITERAL function. I have tested the code with the following "ugly" file names:

  • 3abc.csv
  • de$fg.txt.csv
  • This is a very long file name with more than 32 characters.csv
%let path=C:\My_SAS_PROJECTS;
filename folder "&path\Data.";
libname c 'C:\Temp';         /* just an example destination folder */
options validmemname=extend; /* to allow non-standard dataset names */
 
/* Making a list of all files in the folder */
data FilesInFolder;
   length Line 8 File $300;
   List = dopen('folder');  /* corrected the function argument */
   do Line = 1 to dnum(List);
        File = trim(dread(List,Line));
        output;
   end;
   drop list line;
run;
 
/* Creating global macro variables */  /* not "local" */
data _NULL_;
     set FilesInFolder end=final;
     call symput(cats('File', _N_), trim(File));     /* used CATS instead of COMPRESS (...||...) */
     call symput(cats('Name', _N_), trim(nliteral(substr(File,1,min(32, length(File)-4))))); /* inserted */
     if final then call symputx(trim('Total'), _N_); /* replaced symput by symputx */
run;


/* This macro should import all files specified in the list and save them as datasets */
%macro loop;
%do i = 1 %to &Total;
  proc import datafile="&path\Data\&&File&i"
       out=c.&&name&i  /* adapted */
       dbms=csv
       replace;
       getnames=no;
  run;
%end;
%mend loop;
 
%loop

If non-standard dataset names have been created, you can use them as in this example:

proc print data=c.'de$fg.txt'n;
run;

 

 

 

View solution in original post


All Replies
SAS Super FREQ
Posts: 703

Re: Importing several files into SAS

hi

 

@ChrisHemedinger has written a great blog entry on this subject, how to loop over a list of values and run some SAS code, see

http://blogs.sas.com/content/sasdummy/2012/03/20/sas-program-by-processing/ and http://blogs.sas.com/content/sasdummy/2012/03/23/improving-on-a-sas-programming-pattern/

 

It has great explanation on how everything works.

 

Bruno

Solution
‎01-06-2016 03:48 AM
Trusted Advisor
Posts: 1,115

Re: Importing several files into SAS

I've made a few modifications to your code. If the names of your CSV files are valid SAS dataset names, you can omit the VALIDMEMNAME option and the NLITERAL function. I have tested the code with the following "ugly" file names:

  • 3abc.csv
  • de$fg.txt.csv
  • This is a very long file name with more than 32 characters.csv
%let path=C:\My_SAS_PROJECTS;
filename folder "&path\Data.";
libname c 'C:\Temp';         /* just an example destination folder */
options validmemname=extend; /* to allow non-standard dataset names */
 
/* Making a list of all files in the folder */
data FilesInFolder;
   length Line 8 File $300;
   List = dopen('folder');  /* corrected the function argument */
   do Line = 1 to dnum(List);
        File = trim(dread(List,Line));
        output;
   end;
   drop list line;
run;
 
/* Creating global macro variables */  /* not "local" */
data _NULL_;
     set FilesInFolder end=final;
     call symput(cats('File', _N_), trim(File));     /* used CATS instead of COMPRESS (...||...) */
     call symput(cats('Name', _N_), trim(nliteral(substr(File,1,min(32, length(File)-4))))); /* inserted */
     if final then call symputx(trim('Total'), _N_); /* replaced symput by symputx */
run;


/* This macro should import all files specified in the list and save them as datasets */
%macro loop;
%do i = 1 %to &Total;
  proc import datafile="&path\Data\&&File&i"
       out=c.&&name&i  /* adapted */
       dbms=csv
       replace;
       getnames=no;
  run;
%end;
%mend loop;
 
%loop

If non-standard dataset names have been created, you can use them as in this example:

proc print data=c.'de$fg.txt'n;
run;

 

 

 

Regular Contributor
Posts: 180

Re: Importing several files into SAS

Thank you for your tips.

 

I have a few of questions:

 

1. Can you kindly explain the line you added with the nliteral command, what is it for ?

 

2. If I want that the first row in each file will be the variable names, should I change the getnames to yes ?

 

3. If I wanted to change this code, so instead of csv files, it will do the same with XLS files (Excel 97-2003), what needs to be modified ?

 

Thank you very much, big help.

Trusted Advisor
Posts: 1,115

Re: Importing several files into SAS

You're welcome. Here are the answers to your questions:

 

  1. I inserted that line in order to prepare names for the output datasets. This could be simplified if you were happy with dataset names such as DS1, DS2, etc. In this case one could write the original CSV file names into dataset labels in order to provide the link between the dataset and the corresponding raw data file.

    My suggestion aimed at taking the CSV file names as dataset names (i.e. abc.csv --> abc.sas7bdat), as far as possible, so that it is obvious which raw data file belongs to a given dataset. But there is the potential issue that some of the CSV file names do not comply with the rules for valid SAS dataset names. Problematic cases would include names which start with a digit, contain special characters other than the underscore or are longer than 32 characters.

    The expression substr(File,1,min(32, length(File)-4)) results in the CSV file name without the ".CSV" suffix, truncated to 32 characters, if the original name was longer. The NLITERAL function converts this string to a so called SAS name literal. Examples: NLITERAL('abc')='abc' (i.e. valid SAS names are left unchanged), NLITERAL('ab$c')='"ab$c"N' (in general, the dollar sign is not allowed in a dataset name, but in the special syntax "..."N, it is acceptable). If datasets with non-standard names are created, the option setting VALIDMEMNAME=extend is necessary to let SAS accept them. Honestly, I have never used such non-standard dataset names in practice and I would rather recommend to avoid them. So, in your situation I would actually look through the list of CSV file names and see how non-compliant names (if any) could be transformed into valid SAS names. For example, if blanks were the only special characters in the names, I would simply replace them with underscores in the dataset names.

  2. Yes, if the first line in each CSV file contains the variable names, GETNAMES=yes would instruct SAS to name the variables correspondingly. SAS would automatically modify the names appropriately if they were not valid variable names (depending on the setting of system option VALIDVARNAME).

  3. If you have licensed SAS/ACCESS Interface to PC File Formats, you can specify dbms=excel (or dbms=xls) in the PROC IMPORT step to import Excel 97-2003 files. In addition, there are SHEET= and RANGE= options to specify the spreadsheet (if there are more than one) and a range of cells (if only part of the spreadsheet is to be imported). Alternatively, you could use a LIBNAME statement of the form libname myxls excel path="&path\Data\test.xls" to access an Excel workbook as if it was a SAS library. For more details please see
    the documentation: http://support.sas.com/documentation/cdl/en/acpcref/67382/PDF/default/acpcref.pdf
    or some of the papers on this subject, e.g. http://support.sas.com/resources/papers/proceedings10/144-2010.pdf.

     

    I don't have that SAS/ACCESS license (and I don't have Excel installed on my SAS workstation), so I can't test this. When I imported Excel sheets in the past (primarily with SAS 6 and SAS 8), I mostly used DDE (dynamic data exchange), which is an "old", partly outdated technology, but very flexible. For example, I was able to let SAS detect the text color used in an Excel sheet (which carried some information), but that was extremely complicated.

Regular Contributor
Posts: 180

Re: Importing several files into SAS

Thank you, everything makes sense now.
Your "old" code sounds complicated indeed ! :-)
Super User
Posts: 11,101

Re: Importing several files into SAS

[ Edited ]

One of many cautions with using Proc Import and reading variable names from column headers: The discussion that mentions for data set names also appies to variables with the added complication that if you have two or more column headings that start with the same text that proc import may truncate the first occurance and the next could get a very generic varaible name of VARxx where xx refers to the column number.

 

Example:

Column 10: This is a very long column header related to the annual values of Product X

Column 11: This is a very long column header related to the annual values of Product Y

 

With maximum variable name length of 32 characters these tow column headers "look the same" to the Proc Import processor.

Super Contributor
Posts: 428

Re: Importing several files into SAS

Could you please tell me why we need two ampersand before file and name macro in following proc import?

 

  proc import datafile="&path\Data\&&File&i"
       out=c.&&name&i  /* adapted */
       dbms=csv
       replace;
       getnames=no;
  run;
Trusted Advisor
Posts: 1,115

Re: Importing several files into SAS

@Babloo: The macro processor resolves expressions like &&name&i in two passes:

  1. && resolves to & and &i is resolved to the content of macro variable i, e.g., to 1 in the first iteration of the "%do i=1 %to ..." loop. Result (in the first iteration of the %DO loop): &name1
  2. &name1 is resolved to the content of macro variable name1.

So, whenever you loop through a list of numbered macro variables with a %DO loop you will probably use this pattern.

 

If the expression was written as &name&i, the macro processor would try to resolve &name to the content of a macro variable name in the first pass, but this is not what we want in the above situation. There is no macro variable name, hence we would get a warning "WARNING: Apparent symbolic reference NAME not resolved."

Super User
Posts: 18,984

Re: Importing several files into SAS

You may want to consider using Call Execute and some macro parameters in your macro instead. This way you can create all the parameters in a data set and then use that data set to drive your macro.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 4000 views
  • 5 likes
  • 6 in conversation