Reading csv filename as a variable column in the dataset

Established User
Posts: 1

Hi All,

 

I am trying to use the INFILE statement to read a CSV file.

 

However, the CSV file name contains a date (YYYYMMDD format) followed by a string, and I need to capture both as additional columns in the dataset that the DATA step creates.

 

The resulting dataset I am looking for is:

Columns A - X: contents of the CSV file

Column Y: Date from the filename

Column Z: Part of a string from the filename.

 

It would be great if I could read multiple CSV files at once, pick up the date and string from each file name, and add them as columns to the rows read from that file.

 

Thanks for your help.

 

My code to simply read the file is:

 

DATA WORK.ALL1;
    LENGTH
        SOURCE_Name_1    $ 10
        SOURCE_Name_2    $ 50;
    FORMAT
        SOURCE_Name_1    $CHAR10.
        SOURCE_Name_2    $CHAR50.;
    INFORMAT
        SOURCE_Name_1    $CHAR10.
        SOURCE_Name_2    $CHAR50.;
    INFILE '\\filepath\*.csv'
           LRECL=32767
           DLM=','
           DSD
           FIRSTOBS=2
           MISSOVER;
    INPUT
        SOURCE_Name_1    : $CHAR10.
        SOURCE_Name_2    : $CHAR50.;
RUN;

 

Trusted Advisor
Posts: 1,831

Re: Reading csv filename as a variable column in the dataset

Use the FILENAME= option in the INFILE statement to get the name of the CSV file currently being read:

 

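length csv_name _filepath $200;
infile '//filepath/*.csv'   filename = _filepath ... ;

/* the FILENAME= variable is not written to the output dataset, so copy it */
csv_name = _filepath;

Then extract the date and the required string from csv_name using the SUBSTR function.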
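
A minimal sketch of that extraction step, run on a single hard-coded value so it can be tried on its own. The file name 20240131_sales.csv and the variable names base, file_date and file_label are hypothetical; the sketch assumes the name starts with the 8-digit date, followed by one separator character and then the string. Adjust the positions if your names are laid out differently.

```sas
data _null_;
    length _filepath $200 base $100 file_label $50;

    /* hypothetical value of the kind FILENAME= would return */
    _filepath = '\\filepath\20240131_sales.csv';

    /* last path component: everything after the final / or \ */
    base = scan(_filepath, -1, '/\');

    /* first 8 characters are the date; read them as a SAS date value */
    file_date = input(substr(base, 1, 8), yymmdd8.);
    format file_date date9.;

    /* skip the date and the assumed one-character separator, drop the extension */
    file_label = scan(substr(base, 10), 1, '.');

    put base= file_date= file_label=;
run;
```

Inside the real step the same three assignments would follow the INPUT statement, with _filepath supplied by the FILENAME= option instead of a literal.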

Super User
Posts: 9,923

Re: Reading csv filename as a variable column in the dataset

Use the filename= option in the infile statement:

data work.all1;
length infilename $200;
infile "&filepath.\*.csv"
  lrecl=32767
  dlm=','
  dsd
  firstobs=2
  truncover
  filename=infilename
;

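Now the variable infilename will hold the name of the file currently being read. Note that it is not written to the output dataset, so assign it to another variable if you want to keep it.

Be aware that the firstobs= option applies only to the first file. Header lines in the succeeding files need to be filtered out in the step.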
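
One way to filter those header lines is to watch for the point where the FILENAME= variable changes, since the first record read after a change is that file's header. The following is only a sketch under some assumptions: &filepath is a macro variable you have already defined, the two input variables are taken from the original post, and prev_name and csv_name are names made up for illustration. FIRSTOBS=2 is dropped because the step now skips the first line of every file itself.

```sas
data work.all1;
    length infilename csv_name prev_name $200
           SOURCE_Name_1 $10 SOURCE_Name_2 $50;
    retain prev_name;          /* remembers the file the previous record came from */
    drop prev_name;

    infile "&filepath.\*.csv"
        lrecl=32767
        dlm=','
        dsd
        truncover
        filename=infilename
    ;

    /* load the next record and hold it so infilename is up to date */
    input @;

    /* first record of a new file is its header line: remember the file, skip the row */
    if infilename ne prev_name then do;
        prev_name = infilename;
        delete;
    end;

    /* read the held record as data */
    input
        SOURCE_Name_1 : $char10.
        SOURCE_Name_2 : $char50.
    ;

    /* copy the file name into a variable that is kept in the dataset */
    csv_name = infilename;
run;
```

The date and string columns can then be derived from csv_name in the same step.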

 

Edit: corrected filevar= to filename=
