BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Yoshihisa-Yokoi
Calcite | Level 5

Hi everyone,

I am new to SAS programming and I am trying to read multiple CSV files into SAS. I have tried using the infile statement with a wildcard to read all the files in a folder, but it doesn't seem to be working. Can someone please guide me on how to read multiple CSV files in SAS?

Here's the code I am using:

filename files "/path/to/folder/*.csv";

data mydata;
  infile files dlm=',' firstobs=2;
  input var1 var2 var3;
run;

I am not getting any errors, but it seems like only the first file is being read. I have searched online for a solution but haven't been able to find one that works for me. Any help would be greatly appreciated.

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Your code will only skip the header line on the FIRST file it finds.

To skip all of the headers use the FILENAME= option of the INFILE statement to create a variable to contain the name of the current file so you can skip the header line of each file.

data mydata;
  length filen $256;
  infile "/path/to/folder/*.csv" dsd truncover filename=filen;
  input @@;
  if filen ne lag(filen) then delete;
  input var1 var2 var3;
run;

The variable created by the FILENAME= option is not output. So if you want to keep the name of the file that contributed the current observation then add another variable.  This code will also add a simple numeric variable to count how many files have been read.

data mydata;
  length filen $256;
  infile "/path/to/folder/*.csv" dsd truncover filename=filen;
  input @@;
  if filen ne lag(filen) then do;
     fileno+1;
     delete;
  end;
  input var1 var2 var3;
  filename=filen;
run;

View solution in original post

2 REPLIES 2
Tom
Super User Tom
Super User

Your code will only skip the header line on the FIRST file it finds.

To skip all of the headers use the FILENAME= option of the INFILE statement to create a variable to contain the name of the current file so you can skip the header line of each file.

data mydata;
  length filen $256;
  infile "/path/to/folder/*.csv" dsd truncover filename=filen;
  input @@;
  if filen ne lag(filen) then delete;
  input var1 var2 var3;
run;

The variable created by the FILENAME= option is not output. So if you want to keep the name of the file that contributed the current observation then add another variable.  This code will also add a simple numeric variable to count how many files have been read.

data mydata;
  length filen $256;
  infile "/path/to/folder/*.csv" dsd truncover filename=filen;
  input @@;
  if filen ne lag(filen) then do;
     fileno+1;
     delete;
  end;
  input var1 var2 var3;
  filename=filen;
run;
Kurt_Bremser
Super User

Since a file reference can also be used for output, wildcards don't make sense there.

Do everything in the INFILE:

data mydata;
  infile  "/path/to/folder/*.csv" dlm=',' firstobs=2;
  input var1 var2 var3;
run;

You may need to add code to skip the header line in every new file.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 954 views
  • 0 likes
  • 3 in conversation