How do I import the names of files (different types) in a folder along with the data.

Occasional Contributor Aj
Posts: 12

Hi, can anyone please help me find a way to do the following?

I have multiple files in a folder, in different formats (.tmp, .processed, .orgi). I want to import them all at once and do the following:

1. Code to read in just the file names in the folder.

2. Code to import the name of the file as a separate column, along with the data in the file.

3. Code to use each file's name as the data set name for that file's data.


Accepted Solution
04-26-2016 01:59 PM
Super User
Posts: 7,401

Re: How do I import the names of files (different types) in a folder along with the data.

Well, 1 is as simple as reading a dir listing via a pipe, and the second point is a simple scan:

filename tmp pipe 'dir "c:\directory\*.*" /b';
data want;
  length buffer $2000;
  infile tmp truncover;
  input buffer $2000.;              /* read the whole line, including any embedded blanks */
  filename = scan(buffer, 1, ".");  /* file name up to the first period */
run;

Point 3 is not clear to me at all. Please clarify. Are you planning on reading these files, of mixed type and probably mixed structure, in? If so, then you're going to need a process for each different file and structure.



All Replies
Super User
Posts: 10,500

Re: How do I import the names of files (different types) in a folder along with the data.

There are several ways to do the first. Some are operating-system dependent, and there may be permission settings to deal with if you are working in a server environment.

The SAS functions DOPEN, DREAD, and the associated directory functions are intended to read directory information.
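As a rough illustration of the DOPEN/DREAD approach, here is a minimal sketch that lists the members of a folder without shelling out to the operating system. The fileref `mydir`, the path, and the data set and variable names are placeholders, not anything from the original posts.

```sas
/* Hypothetical folder; replace with your own path. */
filename mydir "c:\directory";

data file_list;
  length name $256;
  did = dopen("mydir");          /* open the directory; returns 0 on failure */
  if did > 0 then do;
    do i = 1 to dnum(did);       /* DNUM returns the number of members */
      name = dread(did, i);      /* DREAD returns the name of the i-th member */
      output;
    end;
    rc = dclose(did);            /* release the directory handle */
  end;
  else put "ERROR: could not open the directory.";
  keep name;
run;
```

Note that DREAD returns subdirectory names as well as file names, so the result may need filtering before it is used to drive any imports.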

 

The second is going to require, as a minimum, that you specify what type of file to read for each file. If you are expecting to use PROC IMPORT, that would be the DBMS= option, but you will have to say what type of file it is in terms SAS understands. I promise there is no standard file type for the .processed, .orgi, or .tmp extensions. If these are consistently delimited files, such as CSV, tab, or some other special character, you may have a chance. If the data is fixed column, or some report-type file with different lines needing different processing, you are SOL. You would also have to add the file name after importing, as that is not a standard option with PROC IMPORT.

Otherwise you're going to have to write a program to read each file layout.
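For the delimited case, a single-file import might look something like the sketch below. The file name, the output data set name, the pipe delimiter, and the presence of a header row are all assumptions made for illustration; the point is only that DBMS=DLM tells PROC IMPORT how to treat an extension it does not recognize, and that the file name has to be added in a separate step.

```sas
/* Hypothetical file and data set names; the pipe delimiter is an assumption. */
proc import datafile="c:\directory\example.txt.processed"
            out=work.example
            dbms=dlm
            replace;
  delimiter='|';
  getnames=yes;   /* assumes the first row holds column names */
run;

/* PROC IMPORT has no option for the source name, so add it afterwards. */
data work.example;
  length source_file $260;
  set work.example;
  source_file = "example.txt.processed";
run;
```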

I've been involved in a project where all of the files were supposedly the same layout, but different variable lengths, numbers of rows per record, and variables per line required tweaking a "standard" code template in about 10 places to finally read the 60 or so files involved. So I suspect that you will learn to write or modify DATA step input code for this project.

 

How many total files are you needing to work with?

 

 

The third may not be possible at all, as 1) a period is not acceptable in a SAS data set name, 2) SAS only allows letters, numbers, and _ to appear in data set names, and 3) you may have an issue with the overall length of the file name.
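If data set names derived from file names are still wanted, one possible workaround is to map each file name to a legal name first. The sketch below is only an illustration: the variable names and the two sample file names are made up, and the mapping can produce collisions (two files that differ only in punctuation end up with the same name). It relies on the standard naming rules: letters, digits, and underscores only, no leading digit, and at most 32 characters.

```sas
data name_map;
  length fname $256 dsname $32;      /* $32 truncates to the maximum data set name length */
  infile datalines truncover;
  input fname $256.;
  /* Replace every character that is not a letter, digit, or underscore. */
  dsname = prxchange('s/[^A-Za-z0-9_]/_/', -1, strip(fname));
  /* A data set name cannot start with a digit, so prefix an underscore. */
  if anydigit(dsname) = 1 then dsname = '_' || dsname;
  datalines;
report1.txt.processed
2016 totals.txt.orgi
;
run;
```

With these sample values, the periods and the blank become underscores, and the second name gains a leading underscore because it starts with a digit.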

Occasional Contributor Aj
Posts: 12

Re: How do I import the names of files (different types) in a folder along with the data.

Hi Ballard,

 

I have around 70 different files with the same structure, all pipe delimited. They are all text files, with .txt.processed, .txt, and .txt.orgi as the file types.

 

I was able to read the names of the files into a data set as mentioned by the User 'RW9'. 

 

I can import all the files at once into a single data set, using the code:

infile ('C:\path\*.log' 'C:\path\*.gti' 'c:\path\*.txt')

But I want to get the corresponding file name for each of those files, added as a separate column.

 

 

 

 

Occasional Contributor Aj
Posts: 12

Re: How do I import the names of files (different types) in a folder along with the data.

I have around 70 different files with the same structure, all pipe delimited. They are all text files, with .txt.processed, .txt, and .txt.orgi as the file types.

 

I was able to read the names of the files into a data set as mentioned by you. 

 

I can import all the files at once into a single data set, using the code:

infile ('C:\path\*.log' 'C:\path\*.gti' 'c:\path\*.txt')

But I want to get the corresponding file name for each of those files, added as a separate column.

 

I agree, the last suggestion is too complicated. Sorry to confuse you.

Super User
Posts: 10,500

Re: How do I import the names of files (different types) in a folder along with the data.

Add a variable to store the name, and add the FILENAME= option to your INFILE statement.

data read;
   length InputFile $ 200; /* or any length long enough to hold the path and file name */
   infile ('C:\path\*.log' 'C:\path\*.gti' 'c:\path\*.txt') dlm='|' filename=Tfile;
   InputFile = Tfile;
   /* rest of your input */
run;

Occasional Contributor Aj
Posts: 12

Re: How do I import the names of files (different types) in a folder along with the data.

Hi ballard,

I tried your code and gave InputFile a length of $5000, but it doesn't copy the whole path.

It stops at C:\Users. I tried it from a different directory, but the result is the same.

What do you think is the problem here?

 

Thanks,

Super User
Posts: 10,500

Re: How do I import the names of files (different types) in a folder along with the data.

[ Edited ]

Both the permanent variable and the FILENAME= variable need to have their lengths declared. Sorry I missed that. Put both variables on the LENGTH statement with the same length. Otherwise the FILENAME= variable defaults to a character length of 8, which is why the value stops at C:\Users (exactly 8 characters).
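Putting that correction together, the step might look like the sketch below. The length of 260, the `ShortName` variable, and the empty INPUT statement are assumptions added for illustration; the paths and the other variable names are the ones used earlier in the thread.

```sas
data read;
  /* Declare both variables, with the same length, before the INFILE statement. */
  length InputFile Tfile ShortName $ 260;
  infile ('C:\path\*.log' 'C:\path\*.gti' 'c:\path\*.txt') dlm='|' filename=Tfile;
  input;                             /* placeholder: replace with your own INPUT statement */
  InputFile = Tfile;                 /* the FILENAME= variable is not written out, so copy it */
  ShortName = scan(Tfile, -1, '\');  /* optional: keep only the part after the last backslash */
run;
```

The copy into InputFile is still needed because the variable named in FILENAME= is not written to the output data set.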

 

 

☑ This topic is SOLVED.
