%macro read_files (out=zip_file);
%let FOLDER = C:\planinfo\IRS_DATA;
filename _tree pipe "DIR /b &FOLDER *.txt ";
data temp1;
infile _tree pad recfm=v lrecl=1024;
length file path $200;
input file $ ;
path=cats("&folder","\",file);
if file = "WindowsCodecsRaw.txt" then delete;
run;
%Local i work_dsn n_ids;
proc sql noprint;
select
count(*)
into :n_ids
from
temp1;quit;
%put n_ids=&n_ids;
%Do i = 1 %To &n_ids;
%Let last_n = %Eval(&i - 1);
Proc Sql noprint;
Select distinct path Into :filename Separated by ','
From temp1 (FirstObs=&i obs=1)
;Quit;
%put filename=&filename;
proc import
datafile= "&filename"
out=temp2
dbms=csv
replace
;
delimiter=','
;getnames=yes ;
run;
%if &i=1 %then %do;
%if %sysfunc(exist(&out)) %then %do;
Proc Sql Noprint;
Drop Table &out;
Quit;
%end;
data &out;
set temp2;
run;
%end;
%else %do;
Proc Append Base=&out Data=temp2 force; Run;
%end;
Proc Sql Noprint;
Drop Table work.temp2;
Quit;
%end;
%mend;
%read_files;
I'm utilizing the code to import all txt files to append to the first read file. It's receiving an error on the firstobs. Its reading the path correctly but for some reason I can't figure out why its not recreating the other 2 files. It seems to be replicating the first file. Any help would be appreciated. Here is the error below and a screenshot of my TEMP1 dataset.
ERROR: FIRSTOBS option > OBS option - no data to read from file WORK.TEMP1.
The OBS= option is more like a LASTOBS= option. So it worked when you had FIRSTOBS=1 and OBS=1 since that means read from 1 to 1. But then next time through the loop you asked for FIRSTOBS=2 and OBS=1 which looks like you are asking SAS to read the file backwards from 2 to 1.
Just make this change to fix that issue.
From temp1 (FirstObs=&i obs=&i)
The bigger issue that you are reading 3 files with PROC IMPORT and then expecting the results to be compatible enough that you can use PROC APPEND to combine them. That is not going to work because PROC IMPORT has to guess each time how to define the variables based solely on the examples it sees in this one version of the file.
If you know what is in the files just write your own data step to read them. Which also means you don't need the rest of that complicated macro.
%let out=zip_file;
%let FOLDER = C:\planinfo\IRS_DATA;
data &out;
length filename $200;
infile "&folder\*.txt" dsd truncover filename=filename;
input @;
* Skip header row ;
if filename ne lag(filename) then delete ;
input var1 var2 ... .;
run;
If you really want to let PROC IMPORT figure out want is in the files but you know they have the same columns in the same order then combine the files and let it work on the combined file.
filename combined temp;
data _null_;
length filename $200;
infile "&folder\*.txt" filename=filename;
file combined ;
input ;
* Skip header row ;
if filename ne lag(filename) and _n_ >1 then delete;
put _infile_;
run;
proc import datafile=combined dbms=csv out=&out replace;
getnames=yes;
run;
Update: FILENAME= not FILEVAR=
The OBS= option is more like a LASTOBS= option. So it worked when you had FIRSTOBS=1 and OBS=1 since that means read from 1 to 1. But then next time through the loop you asked for FIRSTOBS=2 and OBS=1 which looks like you are asking SAS to read the file backwards from 2 to 1.
Just make this change to fix that issue.
From temp1 (FirstObs=&i obs=&i)
The bigger issue that you are reading 3 files with PROC IMPORT and then expecting the results to be compatible enough that you can use PROC APPEND to combine them. That is not going to work because PROC IMPORT has to guess each time how to define the variables based solely on the examples it sees in this one version of the file.
If you know what is in the files just write your own data step to read them. Which also means you don't need the rest of that complicated macro.
%let out=zip_file;
%let FOLDER = C:\planinfo\IRS_DATA;
data &out;
length filename $200;
infile "&folder\*.txt" dsd truncover filename=filename;
input @;
* Skip header row ;
if filename ne lag(filename) then delete ;
input var1 var2 ... .;
run;
If you really want to let PROC IMPORT figure out want is in the files but you know they have the same columns in the same order then combine the files and let it work on the combined file.
filename combined temp;
data _null_;
length filename $200;
infile "&folder\*.txt" filename=filename;
file combined ;
input ;
* Skip header row ;
if filename ne lag(filename) and _n_ >1 then delete;
put _infile_;
run;
proc import datafile=combined dbms=csv out=&out replace;
getnames=yes;
run;
Update: FILENAME= not FILEVAR=
@Tom Thank you so much for the insight. I tried using the data step option, received an error "ERROR: A Physical file reference (i.e. "PHYSICAL FILE REFERENCE" ) or an aggregate file storage reference (i.e. AGGREGATE(MEMBER) ) reference cannot be used with the FILEVAR= option."
I saw a solution you suggested on a previous source, so i changed the FILEVAR option to Filename=Filename, i'm getting null results for all with no errors (i'll take the null errors 🙂 ) Not sure why my data is null. Any thoughts?
Heres a snippet from the log
NOTE: Invalid data for STATEFIPS in line 2 1-775.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 CHAR 1.AL.0.1.0_25.836320.481570.109790.233260.455560.1356760.525260.26020.16850.9170.5730.11620.255610.1
ZONE 3044030303533033333303333330333333033333303333330333333303333330333330333330333303333033333033333303
NUMR 191C909190F25983632094815709109790923326094555609135676095252609260209168509917095730911620925561091
101 49910.10855656.836310.11036309.669770.8646476.92010.59467.43130.72032.38540.47871.12050.5426.146630.
ZONE 3333303333333303333330333333330333333033333330333330333330333330333330333330333330333330333303333330
NUMR 4991091085565698363109110363099669770986464769920109594679431309720329385409478719120509542691466309
2
@bknitch wrote:
Heres a snippet from the log
NOTE: Invalid data for STATEFIPS in line 2 1-775.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----02 CHAR 1.AL.0.1.0_25.836320.481570.109790.233260.455560.1356760.525260.26020.16850.9170.5730.11620.255610.1
ZONE 3044030303533033333303333330333333033333303333330333333303333330333330333330333303333033333033333303
NUMR 191C909190F25983632094815709109790923326094555609135676095252609260209168509917095730911620925561091101 49910.10855656.836310.11036309.669770.8646476.92010.59467.43130.72032.38540.47871.12050.5426.146630.
ZONE 3333303333333303333330333333330333333033333330333330333330333330333330333330333330333330333303333330
NUMR 4991091085565698363109110363099669770986464769920109594679431309720329385409478719120509542691466309
2
Start with the DATA statement through the RUN with all the notes. Paste the LOG into a code box opened with the </> icon on the form to preserve formatting.
The error indicates that what ever the data may be the INPUT information you gave it is incompatible with the data for that line (and likely others).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.