BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Nieves
Quartz | Level 8

I want to loop through every CSV file in a folder, delete the first two line of each CSV file, and save each CSV file as a SAS data in SAS 9.4. The CSV file name is indexed using number and in the format of 1_keyword1_keyword2, 2_keyword3_keyword4,...

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You were almost there. You just need to add the logic to skip the lines to the code in the IF EOV block.

Looks like you data files also include information on the first line that you could read.

/*
Read all CSV files into single dataset.

Example file format:
   ----+----0----+----0
1) Category: Finance
2) 
3) Month,EFFECTIVE: (California),loss: (California)
4) 2004-01,18,47
5) 2004-02,10,41

*/
%let path= D:\CA_California ;

data import_all;
  length filename fname $256 category $40 month 8 val1 val2 8 ;
  infile "&path\*.csv" filename=fname truncover dsd dlm=',' eov=eov end=eof ;
  input @;
  filename = scan(fname,-1,'/\');
  if _n_ eq 1 or eov then do;
    input @11 category // ;
    retain category ;
  end;
  input month :anydtdte. val1 val2 ;
  format month yymm7.;
  eov=0;
run;
proc print; 
  where month between '01JAN2012'd and '01MAR2012'd ;
run;

Capture.PNG

 

View solution in original post

10 REPLIES 10
Reeza
Super User

Do they all have the same format in the CSV? Do you have code to read one file?

What's your OS?

 

When you say ignore the first two lines, do the text files need to be modified or can you just read the CSV using the FIRSTOBS as starting in row 3?

 

Can you post a sample CSV? 

 

Here's one way you can do this, if they're all in the same folder and you can have them as one data set that's actually the easiest way to read them all in at once.

https://communities.sas.com/t5/SAS-Communities-Library/How-do-I-write-a-macro-to-import-multiple-tex...

Nieves
Quartz | Level 8

Hi Reeza,

 

Thank you very much for your reply. All files have the same format in the CSV and I have tried your code to run. However, since the first observation is some unrelevant word in the first line (Category: Finance), so I could not use the first observation to read line three. I am new to SAS and not sure how to use FIRSTOBS to start from line 3. The sample CSV is as follows:

CSV example.JPG

 

Therefore, when using your code the output is reading the file name for many times, but not the content of the file as follows:

SAS data example.JPG

Appreicate if you might advice on how to loop and read and save all files into one SAS data? 

Reeza
Super User

Post what your code is so far and attach a sample CSV file.

I can't write code from pictures and won't type out your data.

 

Look at the INFILE statement and FIRSTOBS and Example 5 which also illustrates how to read from multiple input files.

http://support.sas.com/documentation/cdl/en/lestmtsref/69738/HTML/default/viewer.htm#n1rill4udj0tfun...

 

And here's a basic tutorial on reading text files:

https://stats.idre.ucla.edu/sas/modules/inputting-data-into-sas/

Nieves
Quartz | Level 8

Hi Reeza,

 

Sure. My code is as follows:

 

data import_all;
length filename txt_file_name $256;
retain txt_file_name;
infile "D:\CA_California\*.csv" eov=eov filename=filename truncover DELIMITER='09'x;
input@;

if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "\");
eov=0;
end;
else input filename $ month $ keyword1 $ keyword2
;
RUN;

 

For each CSV file, there are two key words in the name: 1_CA_keyword1_keyword2, which correspond to the two column name in the file. The key word changes in each file, which is problematic to code. May I ask how to input the first column in each file under keyword1 and second column under keyword2 ? 

 

I have attached the first 3 CSV file in the folder. 

 

The final SAS data is ideally in the format as follows:

 

txt_file_name  keyword1 keyword2

file name 1

file name 2

 

Thanks for the help. 

Patrick
Opal | Level 21

@Nieves

"May I ask how to input the first column in each file under keyword1 and second column under keyword2 ? "

I'm not really sure if I understand your question so just giving an answer based on assumptions and then providing code which reads your data.

 

Variable name (column names) are fix and you can't change them "on the go". You could read the data into different columns based on some logic but then you'd end up with many missing values for all the columns you don't use for a specific input record. Or you need to read the data into separate tables and then join the tables together side by side using some key (i.e. the date).

 

Below code which stacks the data from the different files into the same columns. I've then added the data in line 3 as an additional description field. Not sure if that's what you're after but it should give you the idea how to read the data. The code is strongly based on an example from the SAS documentation found here:

http://support.sas.com/documentation/cdl/en/lestmtsref/69738/HTML/default/viewer.htm#n1rill4udj0tfun...

 

In below code for command line: %let folderPath=c:\temp; the value "c:\temp" is the location where I've stored the .csv's you've provided. Change this path to the location where you store your .csv's (as seen from the SAS server where your code executes). 

 

%let folderPath=c:\temp;
data want(drop=_:);
  length fileloc myinfile $ 300;
  input fileloc $ ; /* read instream data       */
  fileloc=cats("&folderPath\",fileloc);

  /* The INFILE statement closes the current file 
     and opens a new one if FILELOC changes value 
     when INFILE executes                        */
  infile dummy filevar=fileloc end=done dsd dlm=',' truncover;
  myinfile=fileloc;

  /* DONE set to 1 when last input record read  */
  _row_cnt=0;
  do while(not done);
    /* Read all input records from the currently             */
    /* opened input file, write to want                      */
    /* for the first 3 lines per input file: skip the record */
    _row_cnt+1;
    if _row_cnt in (1,2) then 
      do;
        input;
      end;
    else if _row_cnt =3 then 
      do;
        input _dummy:$1. Description $100.;
      end;
    else 
      do;
        input month:anydtdte. benefit loss;
        format month yymon7.;
        output;
      end;
  end;

  put 'Finished reading ' myinfile=;
  datalines;
3_CA_ABLE_loss.csv
2_CA_BENEFIT_loss.csv
1_CA_EFFECTIVE_loss.csv
;
run;

 

Tom
Super User Tom
Super User

You were almost there. You just need to add the logic to skip the lines to the code in the IF EOV block.

Looks like you data files also include information on the first line that you could read.

/*
Read all CSV files into single dataset.

Example file format:
   ----+----0----+----0
1) Category: Finance
2) 
3) Month,EFFECTIVE: (California),loss: (California)
4) 2004-01,18,47
5) 2004-02,10,41

*/
%let path= D:\CA_California ;

data import_all;
  length filename fname $256 category $40 month 8 val1 val2 8 ;
  infile "&path\*.csv" filename=fname truncover dsd dlm=',' eov=eov end=eof ;
  input @;
  filename = scan(fname,-1,'/\');
  if _n_ eq 1 or eov then do;
    input @11 category // ;
    retain category ;
  end;
  input month :anydtdte. val1 val2 ;
  format month yymm7.;
  eov=0;
run;
proc print; 
  where month between '01JAN2012'd and '01MAR2012'd ;
run;

Capture.PNG

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

That is NOT a Comma Separated Variable file.  It is an Excel file which if you save it as CSV does not become a proper CSV file, but and Excel mashup.  Generally speaking it would be quicker to do an Excel macro which loads all the Excel files one by one, removes the first few rows, then dumps out to CSV.  However you can do this in SAS as:

data want;
  infile "xyz.csv" firstobs=4;
  informat month yymm7.
  input month effective...;
  format month yymm7.;
run;

Again, as this always comes up, CSV is not Excel.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 3387 views
  • 5 likes
  • 5 in conversation