DATA Step, Macro, Functions and more

Help merging multiple CSV files to a dataset.

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Help merging multiple CSV files to a dataset.

I'm attempting to merge 160 CSV files into one dataset using macros. All of the files have similar filenames (BTLT4-#####-#) and 167 column variables. I tried to use the syntax listed below to do this, but I'm given the following error:

WARNING: Argument 1 to function DNUM referenced by the %SYSFUNC or %QSYSFUNC macro function is out of range. NOTE: Mathematical operations could not be performed during %SY

Any help with this will be appreciated!

Here is the syntax.

options mprint;

%macro MultImp(dir=,out=);

  %let rc=%sysfunc(filename(mydir,"&dir"));

  %let did=%sysfunc(dopen(mydir));

  %let lstname=;

  %let memcount=%sysfunc(dnum(&did));

  %if &memcount > 0 %then %do;mer

  %do i=1 %to &memcount;

  %let lstname=%sysfunc(dread(&did,&i));

  %let file=&dir.&lstname;

  PROC IMPORT DBMS=CSV OUT= _&i

            DATAFILE= "&file" REPLACE ;

     GETNAMES=YES;

     SCANTEXT=YES;

     USEDATE=YES;

     SCANTIME=YES;

     DBSASLABEL=NONE;

     TEXTSIZE=100;

  RUN;

  proc append data=_&i base=&out; run;

  proc delete data=_&i; run;

  %end;

  %let rc=%sysfunc(dclose(&did));

  %end;

  %mend MultImp;

  %MultImp(dir=C:\Users\Owner\Desktop\SAS Project\ACL CSV files,out=File);


Accepted Solutions
Solution
‎02-25-2014 07:17 PM
Super User
Super User
Posts: 6,499

Re: Help merging multiple CSV files to a dataset.

I really do not recommend running PROC IMPORT on a series of CSV files because the results are too unpredictable.  The length and even the type of the resulting variable can be very different depending on the actual values that happen to be in the particular file.

But here is a simple macro to use the output of the DIR/B command to generate multiple PROC IMPORT/PROC APPEND calls.

%macro multimp(dir=,out=);

* Make sure output ds does not exist ;

proc delete data=&out; run;

* Read list of filenames and generate PROC IMPORT and PROC APPEND for each one ;

filename code temp ;

data _null_ ;

  infile "dir ""&dir\btlt4-*.csv"" /b" pipe truncover;

  input filename $256.;

  file code ;

  put 'proc import datafile="&dir\' filename +(-1) '" out=onefile replace;'

    / 'run;'

    / 'proc append data=onefile base=&out force; run;'

  ;

run;


* Run the generated code ;

%inc code / source2 ;

%mend multimp ;


View solution in original post


All Replies
Respected Advisor
Posts: 3,777

Re: Help merging multiple CSV files to a dataset.

Perhaps if your CSVs are all the same you can read them using a file reference with wildcard.

filename csv160 'path/BTLT4-*.csv';

Respected Advisor
Posts: 3,777

Re: Help merging multiple CSV files to a dataset.

Do you need those double quotes?

%let rc=%sysfunc(filename(mydir,"&dir"));

Super User
Posts: 5,081

Re: Help merging multiple CSV files to a dataset.

Have you tried examining &DID to see whether you have the proper syntax for identifying the folder?

Right after creating it would be the right place to stick a %PUT statement.  Probably a value of 0 indicates the folder was not properly identified, but the documentation on DOPEN would confirm that.

Occasional Contributor
Posts: 5

Re: Help merging multiple CSV files to a dataset.

After including %put &did; into my syntax, I was indeed given a value of 0. What can I do to fix this?

- SMR

As a reminder, this is the error I receive for the syntax above.

WARNING: Argument 1 to function DNUM referenced by the %SYSFUNC or %QSYSFUNC macro function is out

         of range.

NOTE: Mathematical operations could not be performed during %SYSFUNC function execution. The

      result of the operations have been set to a missing value.

0

Super User
Posts: 5,081

Re: Help merging multiple CSV files to a dataset.

To figure that out, I would have to read the documentation on DOPEN, and fiddle with it till I found what DOPEN expects.  I would submit that you should do that part.  Instead of MYDIR, figure out what DOPEN expects in this statement:

%let did = %sysfunc(dopen(mydir));

%put DID is &did;

Once you find a value of &DID that is nonzero, the rest will be much easier.

You won't need a full-blown macro.  Playing with just those two lines of code would be enough.

Super User
Super User
Posts: 6,499

Re: Help merging multiple CSV files to a dataset.

PROC IMPORT is NOT what you want to use if you have 160 files all with same 167 variables.

You might use it once and recall the code it generates to help you construct the code.

data want ;

* Define the 167 variables - You can use real names I will use var1-var167 as an example ;

   length var1 var2 ..... var167 ;

* Add some INFORMAT and FORMAT statement for variables that need them, i.e. DATEs and TIMEs ;

* You do NOT want to attach either INFORMATS or FORMATS to simple strings and numbers;

   infile "&mydir/BTLT4-*.csv"  dsd truncover lrecl=32000 ;

* Throw away the header lines;

   input @;

   if upcase(_infile_)=:'VAR1,' then delete;

* Read the data ;

   input var1 --- var167 ;

run;

Respected Advisor
Posts: 3,777

Re: Help merging multiple CSV files to a dataset.

I think it would be better to use the INFILE statement option EOV= that is intended for the purpose of detecteing when a new file has been opened than to use rely the value of a data line.  The OP might also like to know the name of each file provided by the FILENAME= option.

Super User
Super User
Posts: 7,401

Re: Help merging multiple CSV files to a dataset.

Seonding Kurts answer here.  If they are all the same type of file then "copy *.csv all_csv.csv" in CMD is probably easiest. 

Respected Advisor
Posts: 3,777

Re: Help merging multiple CSV files to a dataset.

I don't think that is a good idea at all.  SAS will not open all the files and once and if we do as you suggest we won't be able to tell when a new file is opened EOV= or know the name of the file FILENAME=.

Super User
Super User
Posts: 7,401

Re: Help merging multiple CSV files to a dataset.

Not sure I follow you.  I am suggesting appending the text CSV files outside of SAS into one big file.  Then in SAS importing that one file to create the final dataset.  This as an alternative to importing into SAS many files, then appending them.  If you need to do it several times, save the dos commands in a batch file.  Then each time, copy over all your .CSV files.  Run the batch file to create one big .CSV, and then run your SAS program to import that one complete file.  The SAS code is a very simple proc import then.

Super User
Posts: 17,819

Re: Help merging multiple CSV files to a dataset.

What about the header lines in each CSV file?

Respected Advisor
Posts: 3,777

Re: Help merging multiple CSV files to a dataset.

RW9 wrote:

Not sure I follow you.  I am suggesting appending the text CSV files outside of SAS into one big file.

And I am suggesting that you NOT do that because SAS has better tools to deal with reading concatenate files.  Look at my first reply where I suggest reading the files using a wildcard then there is a reply from Tom where he supplies a more complete example.

Occasional Contributor
Posts: 5

Re: Help merging multiple CSV files to a dataset.

After replacing the PROC IMPORT statement to the syntax below I'm still getting the following error.

dir=C:\Users\Owner\Desktop\SAS Project\ACL CSV files\

did=0

WARNING: Argument 1 to function DNUM referenced by the %SYSFUNC or %QSYSFUNC macro function is out of range.

NOTE: Mathematical operations could not be performed during %SYSF

BTW, thanks for your help thus far! As a beginner with SAS macro all help is appreciated.

- SMR

options mprint;

%macro MultImp(dir=,out=);

  %let rc=%sysfunc(filename(mydir,&dir));

  %let did=%sysfunc(dopen(mydir));

%put dir=&dir;

%put did=&did;

  %let filrf=MYDIR;

  %let lstname=;

  %let memcount=%sysfunc(dnum(&did));

  %if &memcount > 0 %then %do;

  %do i=1 %to &memcount;

  %let lstname=%sysfunc(dread(&did,&i));

  %let file=&dir.&lstname;

  data want;

   length var1 var2 ..... var167 ;

   infile "&mydir\BTLT4-*.csv"  dsd truncover lrecl=32000;

   input @;

   if upcase(_infile_)=:'VAR1,' then delete;

   input var1 --- var167 ;

  run;

  proc append data=_&i base=&out; run;

  proc delete data=_&i; run;

  %end;

  %let rc=%sysfunc(dclose(&did));

  %end;

  %mend MultImp;

  %MultImp(dir=C:\Users\Owner\Desktop\SAS Project\ACL CSV files\,out=File);

Valued Guide
Posts: 2,175

Re: Help merging multiple CSV files to a dataset.

proposed a simple alternative to your macro

data want( compress= yes);

   length var1 var2 ..... var167 $20 filen filename $100;

   infile "&mydir\BTLT4-*.csv"  dsd truncover lrecl=32000 eov=eov filename= filen firstobs=2;

   input @;

  filename= filen ;

   if eov then delete;

   input var1 --- var167 ;

  run;

of course I'm not so I have probably got it "not quite right", but this is the size of the solution (subject to proper definition of 167 lengths and/or informats)

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 22 replies
  • 4628 views
  • 2 likes
  • 9 in conversation