BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
daradanye
Obsidian | Level 7

Hi,

 

I have many tsv files to be imported in the SAS.  The files are located in different subfolders named after numbers and underscore.  For example, under the subfolder 6500_317, there are five tsv files: 8491_2020, 8611_2020, 5491_2020, 7453_2020, 2312_2020

 

I am wondering how I can import all files in all subfolders into the SAS and name after as data_tsvfilename.

 

I will appreciate it very much if someone can help here.

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
/*You can use OS command to get all these TSV file path.
After that,using PROC IMPORT it one by one.
Like this :
*/
%let path_in= c:\temp ;
filename x pipe "dir &path_in.\*.tsv /s /b";
data x;
 infile x truncover;
 input path $2000.;
run;

data _null_;
 set x;
 call execute(catt('proc import datafile="',path,'" out=data_',scan(path,-2,'.\'),' dbms=csv replace;run;'));
run;

View solution in original post

3 REPLIES 3
Tom
Super User Tom
Super User

Details matter a lot and you have left out a lot of them.

 

First you need to define a TSV file.  What is that?  Let's assume you meant a tab delimited text file.

Second do all of the files have the same type of content?  The same fields in the same order?

Third do the fields in the file contain the information that appears to be encoded into the directory and file names?  Or do you need to create new variables in the generated dataset (or datasets) to store the information contained in the filename.

 

IF the files all contain the same information then it is probably more efficient to just make one dataset.  But if you need multiple datasets then you need to explain what names you want to use. Names in SAS cannot start with digits or contain hyphens.

 

The first step you need to do is get the list of files.  There are literally hundreds of posts on this site with descriptions of how to get a list of existing files into a dataset.

ChrisHemedinger
Community Manager

I take TSV to be "tab separated values". I have an example for how to accomplish this here:

 

https://blogs.sas.com/content/sasdummy/2018/10/09/read-multiple-text-files/

 

You can use wildcard notation in the INFILE statement to match the files:

 

data accel;
  infile "/home/userid/tsv/*.tsv" 
   dlm='09'x;
  length counter 8 
         timestamp 8 
         x 8 y 8 z 8 
         filename $ 25;
  input counter timestamp x y z filename;
run;

Edit: just noticed you want the subfolders too. For this, you need to gather all of the filenames ahead by querying the contents of the directories. The DOPEN and DREAD functions help with this. Sample code here in the doc.

 

Check out SAS Innovate on-demand content! Watch the main stage sessions, keynotes, and over 20 technical breakout sessions!
Ksharp
Super User
/*You can use OS command to get all these TSV file path.
After that,using PROC IMPORT it one by one.
Like this :
*/
%let path_in= c:\temp ;
filename x pipe "dir &path_in.\*.tsv /s /b";
data x;
 infile x truncover;
 input path $2000.;
run;

data _null_;
 set x;
 call execute(catt('proc import datafile="',path,'" out=data_',scan(path,-2,'.\'),' dbms=csv replace;run;'));
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 737 views
  • 0 likes
  • 4 in conversation