BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
daradanye
Obsidian | Level 7

Hi,

 

I have many tsv files to be imported in the SAS.  The files are located in different subfolders named after numbers and underscore.  For example, under the subfolder 6500_317, there are five tsv files: 8491_2020, 8611_2020, 5491_2020, 7453_2020, 2312_2020

 

I am wondering how I can import all files in all subfolders into the SAS and name after as data_tsvfilename.

 

I will appreciate it very much if someone can help here.

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
/*You can use OS command to get all these TSV file path.
After that,using PROC IMPORT it one by one.
Like this :
*/
%let path_in= c:\temp ;
filename x pipe "dir &path_in.\*.tsv /s /b";
data x;
 infile x truncover;
 input path $2000.;
run;

data _null_;
 set x;
 call execute(catt('proc import datafile="',path,'" out=data_',scan(path,-2,'.\'),' dbms=csv replace;run;'));
run;

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

Details matter a lot and you have left out a lot of them.

 

First you need to define a TSV file.  What is that?  Let's assume you meant a tab delimited text file.

Second do all of the files have the same type of content?  The same fields in the same order?

Third do the fields in the file contain the information that appears to be encoded into the directory and file names?  Or do you need to create new variables in the generated dataset (or datasets) to store the information contained in the filename.

 

IF the files all contain the same information then it is probably more efficient to just make one dataset.  But if you need multiple datasets then you need to explain what names you want to use. Names in SAS cannot start with digits or contain hyphens.

 

The first step you need to do is get the list of files.  There are literally hundreds of posts on this site with descriptions of how to get a list of existing files into a dataset.

ChrisHemedinger
Community Manager

I take TSV to be "tab separated values". I have an example for how to accomplish this here:

 

https://blogs.sas.com/content/sasdummy/2018/10/09/read-multiple-text-files/

 

You can use wildcard notation in the INFILE statement to match the files:

 

data accel;
  infile "/home/userid/tsv/*.tsv" 
   dlm='09'x;
  length counter 8 
         timestamp 8 
         x 8 y 8 z 8 
         filename $ 25;
  input counter timestamp x y z filename;
run;

Edit: just noticed you want the subfolders too. For this, you need to gather all of the filenames ahead by querying the contents of the directories. The DOPEN and DREAD functions help with this. Sample code here in the doc.

 

SAS Hackathon registration is open! Build your skills. Make connections. Enjoy creative freedom. Maybe change the world.
Ksharp
Super User
/*You can use OS command to get all these TSV file path.
After that,using PROC IMPORT it one by one.
Like this :
*/
%let path_in= c:\temp ;
filename x pipe "dir &path_in.\*.tsv /s /b";
data x;
 infile x truncover;
 input path $2000.;
run;

data _null_;
 set x;
 call execute(catt('proc import datafile="',path,'" out=data_',scan(path,-2,'.\'),' dbms=csv replace;run;'));
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 780 views
  • 0 likes
  • 4 in conversation