DATA Step, Macro, Functions and more

Reading multiple csv files

Reply
Occasional Contributor
Posts: 8

Reading multiple csv files

I have about 250 csv files.  They are named abc20160104 abc20160105 abc20160106 and so forth

 

for each individual file I have written the following code:

 

data want;

Infile 'C:\Temp\abc20160104.csv' DLM = ','
Lrecl = 32000 DSD Truncover firstobs = 1;
Informat fname $60. ticker $5. ;
Input
fname ticker VarA VarB VarC VarD VarE VarF /*These are not the names of the actual Variables but just examples */;
run;

Data Want;
set want;
String = scan ( fname , -1 , , 'a' ); 

Date = input (string , yymmdd8.);
format date mmddyy10.;
run;

data want (keep = date ticker VarA VarB VarC); set want;
run;

data want ; set want;

format VarM 12. ;

VarM = (VarB+VarC)/200;

run;

/* And so on -- there are a few more lines */

After the code for each individual file I want the output to be saved as 20160104  20160105 or the file name which was read in;

          Thanx in advance.

   Randy

Super User
Posts: 17,819

Re: Reading multiple csv files

[ Edited ]
Trusted Advisor
Posts: 1,128

Re: Reading multiple csv files

[ Edited ]

Try the call execute like below this is an untested code. The want dataset will have the date value in new variable

 

data want;
do i = 1 to 250;
new=intnx('day',input('20160104',yymmdd8.),i-1);
output;
end;
format new yymmddn8.;
run;

data _null_;
set want;
call execute("data want;
Infile 'C:\Temp\abc"||new||".csv' DLM = ','
Lrecl = 32000 DSD Truncover firstobs = 1;
Informat fname $60. ticker $5. ;
Input
fname ticker VarA VarB VarC VarD VarE VarF /*These are not the names of the actual Variables but just examples */;
run;

Data Want;
set want;
String = scan ( fname , -1 , , 'a' ); 

Date = input (string , yymmdd8.);
format date mmddyy10.;
run;

data want (keep = date ticker VarA VarB VarC); set want;
run;

data want"||new||" ; set want;
format VarM 12. ;
VarM = (VarB+VarC)/200;
run;");

run;
Thanks,
Jag
Super User
Super User
Posts: 6,500

Re: Reading multiple csv files

What is the question?

Also you seem to keep using multiple data steps to perform things that can be done in a single step.  Is there a reason for that?  Did you leave out other lines of code that would make it impossible to combine those steps?

Can you just read them all into the same dataset?

data want;
  infile 'C:\Temp\abc*.csv' DSD DLM = ',' lrecl = 32000 truncover;
  input @;
  if upcase(_infile_) =: 'FNAME,' then input;
  length fname $60 ticker $5 ;
  input fname ticker VarA VarB VarC VarD VarE VarF
 /*These are not the names of the actual Variables but just examples */
  ;
  length string $60 ;
  string = scan ( fname , -1 , , 'a' ); 
  drop string ;
  date = input (string , yymmdd8.);
  format date mmddyy10.;
  VarM = (VarB+VarC)/200;
  format VarM 12. ;
run;
Trusted Advisor
Posts: 1,374

Re: Reading multiple csv files

Few notes:

 

1) All your steps can be done in one step:

data want;
       Infile 'C:\Temp\abc20160104.csv' DLM = ','
               Lrecl = 32000 DSD Truncover firstobs = 1;
      Informat fname $60. ticker $5. ;
      Input  fname ticker VarA VarB VarC VarD VarE VarF ;

      String = scan ( fname , -1 , , 'a' ); 
      Date = input (string , yymmdd8.);
      VarM = (VarB+VarC)/200;
      format date mmddyy10.  VarM 12.;
      keep date ticker VarA VarB VarC;
run;

In order to have datasets named as desired and 

assuming all csv files are of same format, i.e. same order of variables and same length,

you can convert the code into macro, supplying the date part of the file name as argument,

then run the macro - either a row per name or within a loop creating dates from first to last dates given,

something like:

 

%macro single(datex);  /* datex given as yyyymmdd format */
    data f&datex;
       Infile "C:\Temp\abc&datex..csv' DLM = ','
               Lrecl = 32000 DSD Truncover firstobs = 1;
      Informat fname $60. ticker $5. ;
      Input  fname ticker VarA VarB VarC VarD VarE VarF ;

      String = scan ( fname , -1 , , 'a' ); 
      Date = input (string , yymmdd8.);
      VarM = (VarB+VarC)/200;
      format date mmddyy10.  VarM 12.;
      keep date ticker VarA VarB VarC;
   run;
%mend single;

%let fromdate = 20160101;  /* adapt period dates */
%let uptodate = 20161201;
data _null_;
     fromdate = input("&fromdate",yymmddn8.);
     updodate = input("&uptodate",yymmddn8.);

    do i=fromdate to uptodate;
         datex = put(i, yymmdd8.);
         line = '%single(' || datex|| ');'
         call execute(line);
    end;
run;
         

I haven't checked it, so maybe you'll need  recombine the line= and call execute statements;

Ask a Question
Discussion stats
  • 4 replies
  • 129 views
  • 0 likes
  • 5 in conversation