I am using Enterprise Guide 7.4 running SAS 9.4 under Windows 10.
Problem: I have hundreds of files with the same format (I'll attach three examples). There is a line of text alternating with a line of numbers. I want to read all the files in a folder. Each file needs to be reorganized by making the text string a new variable paired with the number string. The name of each file is also added for each row of data in the data file. The results from all files are appended and saved.
This example comes from http://support.sas.com/kb/41/880.html, with minor modifications.
filename DIRLIST pipe 'dir "C:\Users\tebert\Desktop\Data_Folder\*.txt" /b';

data dirlist;
    infile dirlist lrecl=200 truncover;
    input file_name $100.;
run;

data _null_;
    set dirlist end=end;
    count+1;
    call symputx('read'||put(count,10.-l),cats('C:\Users\tebert\Desktop\Data_Folder\',file_name));
    call symputx('dset'||put(count,10.-l),scan(file_name,1,'.'));
    if end then call symputx('max',count);
run;
options mprint symbolgen;

%macro readin;
    %do i=1 %to &max;
        data &&dset&i;
            infile "&&read&i" lrecl=1000 truncover dsd missover;
            input var1 $ var2 $ var3 $ var4 $ var5 $ var6 $ var7 $ var8 $ var9 $ var10 $;
        run;
    %end;
%mend readin;

%readin;
run;
It runs, but I have three questions:
1) I would like to run the attached macro on each file.
2) The program reads each file saved in dirlist. I would like to have the file name added to each row of data in the file.
3) I would like to append all of the files to make one data file.
Here is the macro:
%Macro Manip;
Data one; set one;
retain insectno2;
if insectno2<1 then insectno2=0;
insectno2=insectno2+1;
Data two three; set one;
if mod(insectno2,2) eq 1 then output two;
if mod(insectno2,2) eq 0 then output three;
Data two; set two; drop Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 insectno2;
Data two; set two;
retain insectno2;
if insectno2<1 then insectno2=0;
insectno2=insectno2+1;
data three; set three;
var9=var8; var8=var7; var7=var6; var6=var5; var5=var4; var4=var3; var3=var2; var2=var1; var1=var1; /*Use "waveform;" in place of "var1;" to switch to TBF values.*/;
data three; set three; drop waveform insectno2;
data three; set three;
retain insectno2;
if insectno2<1 then insectno2=0;
insectno2=insectno2+1;
data two; set two three; merge two three; by insectno2;
data two; set two; drop Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9;
%mend;
Thank you for your help.
Regards,
Tim
I am not going to pretend to get what the "insect" coding is doing.
You might want to take a look at this example and see how close it gets to your 2 and 3 requirements.
Note that reading from the same folder with the same file format into a single data set is very simple when header rows are not involved.
data want;
    length file fn $ 100;
    infile "C:\Users\tebert\Desktop\Data_Folder\*.txt"
           lrecl=1000 dsd missover dlm=',' filename=fn;
    informat line $100. var1-var10 best16.;
    input line / var1-var10;
    file=fn;
run;
The FILENAME= option on the INFILE statement names a variable that holds the path of the file currently being read. That variable is temporary (it is not written to the output data set), so you have to copy its value into another variable to keep it. Because the path plus file name is longer than 8 characters, set a length for both the FILENAME= variable and the permanent variable BEFORE either is referenced for the first time.
The / in the INPUT statement moves to the next line, so the two-line structure is read into a single record. The dlm=',' is there because, with the first line in quotes, that line is easy to read as comma-delimited: the value ends at the end of the line, right after the closing quote.
I have no idea why you were reading the data into character variables. If you need that, set the informat accordingly.
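If you want only the file name on each row, rather than the full path that FILENAME= returns, a minimal sketch follows. It assumes the files look the way I described above (a quoted text line followed by a line of ten comma-separated numbers). The variable names path_and_file and file_name are made up for illustration, and the length of 260 is just a generous guess for a Windows path.

```sas
data want;
    /* Set lengths before the variables are first referenced */
    length fn path_and_file $ 260 file_name $ 100;

    /* The wildcard reads every .txt file in the folder in one step;
       fn is temporary and is not written to WANT */
    infile "C:\Users\tebert\Desktop\Data_Folder\*.txt"
           lrecl=1000 dsd missover dlm=',' filename=fn;

    informat line $100. var1-var10 best16.;

    /* Text line, then "/" moves to the next line for the numbers */
    input line / var1-var10;

    path_and_file = fn;               /* keep the full path */
    file_name = scan(fn, -1, '\');    /* last backslash-delimited piece */
run;
```

Because every file lands in the same data set, this should cover both the "file name on each row" and the "append everything" requirements in one pass. If the values really do need to be character, change best16. to a character informat such as $16. in the INFORMAT statement.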
See this post on how to modify your current code to create an automated process to read all in at once.