BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bknitch
Quartz | Level 8

%macro read_files (out=zip_file);

%let  FOLDER = C:\planinfo\IRS_DATA;

filename _tree pipe "DIR /b &FOLDER *.txt ";

data temp1;

infile _tree pad recfm=v lrecl=1024;

length file path $200;

input file $ ;
path=cats("&folder","\",file);
if file = "WindowsCodecsRaw.txt" then delete;

run;

 

%Local i work_dsn n_ids;

 

   proc sql noprint;

   select

   count(*)

   into :n_ids

   from

   temp1;quit;

 %put n_ids=&n_ids;

 

   %Do i = 1 %To &n_ids;

      %Let last_n = %Eval(&i - 1);

         Proc Sql noprint;

         Select distinct path Into :filename Separated by ','

           From temp1 (FirstObs=&i obs=1)

                ;Quit;

%put filename=&filename;


proc import

datafile= "&filename"

out=temp2

dbms=csv

replace

;

delimiter=','

;getnames=yes ;

run;


%if &i=1 %then %do;

 

%if %sysfunc(exist(&out)) %then %do;

Proc Sql Noprint;

      Drop Table &out;

   Quit;

%end;

 

data &out;

set temp2;

run;

 

%end;

%else %do;


Proc Append Base=&out Data=temp2 force; Run;

%end;



Proc Sql Noprint;

      Drop Table work.temp2;

   Quit;

 

  %end;

 

  %mend;

 

 

%read_files;

I'm utilizing the code to import all txt files to append to the first read file. It's receiving an error on the firstobs. Its reading the path correctly but for some reason I can't figure out why its not recreating the other 2 files. It seems to be replicating the first file. Any help would be appreciated. Here is the error below and a screenshot of my TEMP1 dataset.

 

ERROR: FIRSTOBS option > OBS option - no data to read from file WORK.TEMP1.

bknitch_0-1589926514055.png

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The OBS= option is more like a LASTOBS= option.  So it worked when you had FIRSTOBS=1 and OBS=1 since that means read from 1 to 1.  But then next time through the loop you asked for FIRSTOBS=2 and OBS=1 which looks like you are asking SAS to read the file backwards from 2 to 1. 

Just make this change to fix that issue.

From temp1 (FirstObs=&i obs=&i)

The bigger issue that you are reading 3 files with PROC IMPORT and then expecting the results to be compatible enough that you can use PROC APPEND to combine them.  That is not going to work because PROC IMPORT has to guess each time how to define the variables based solely on the examples it sees in this one version of the file.  

If you know what is in the files just write your own data step to read them.  Which also means you don't need the rest of that complicated macro.

%let out=zip_file;
%let  FOLDER = C:\planinfo\IRS_DATA;

data &out;
  length filename $200; 
  infile "&folder\*.txt" dsd truncover filename=filename;
  input @;
  * Skip header row ;
  if filename ne lag(filename) then delete ;
  input var1 var2 ... .;
run;

If you really want to let PROC IMPORT figure out want is in the files but you know they have the same columns in the same order then combine the files and let it work on the combined file.

filename combined temp;
data _null_;
  length filename $200; 
  infile "&folder\*.txt"  filename=filename;
  file combined ;
  input ;
  * Skip header row ;
  if filename ne lag(filename) and _n_ >1 then delete;
  put _infile_;
run;
proc import datafile=combined dbms=csv out=&out replace;
  getnames=yes;
run;

Update: FILENAME= not FILEVAR= 

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

The OBS= option is more like a LASTOBS= option.  So it worked when you had FIRSTOBS=1 and OBS=1 since that means read from 1 to 1.  But then next time through the loop you asked for FIRSTOBS=2 and OBS=1 which looks like you are asking SAS to read the file backwards from 2 to 1. 

Just make this change to fix that issue.

From temp1 (FirstObs=&i obs=&i)

The bigger issue that you are reading 3 files with PROC IMPORT and then expecting the results to be compatible enough that you can use PROC APPEND to combine them.  That is not going to work because PROC IMPORT has to guess each time how to define the variables based solely on the examples it sees in this one version of the file.  

If you know what is in the files just write your own data step to read them.  Which also means you don't need the rest of that complicated macro.

%let out=zip_file;
%let  FOLDER = C:\planinfo\IRS_DATA;

data &out;
  length filename $200; 
  infile "&folder\*.txt" dsd truncover filename=filename;
  input @;
  * Skip header row ;
  if filename ne lag(filename) then delete ;
  input var1 var2 ... .;
run;

If you really want to let PROC IMPORT figure out want is in the files but you know they have the same columns in the same order then combine the files and let it work on the combined file.

filename combined temp;
data _null_;
  length filename $200; 
  infile "&folder\*.txt"  filename=filename;
  file combined ;
  input ;
  * Skip header row ;
  if filename ne lag(filename) and _n_ >1 then delete;
  put _infile_;
run;
proc import datafile=combined dbms=csv out=&out replace;
  getnames=yes;
run;

Update: FILENAME= not FILEVAR= 

bknitch
Quartz | Level 8

@Tom  Thank you so much for the insight. I tried using the data step option, received an error "ERROR: A Physical file reference (i.e. "PHYSICAL FILE REFERENCE" ) or an aggregate file storage reference (i.e. AGGREGATE(MEMBER) ) reference cannot be used with the FILEVAR= option." 

 

I saw a solution you suggested on a previous source, so i changed the FILEVAR option to Filename=Filename, i'm getting null results for all with no errors (i'll take the null errors 🙂 ) Not sure why my data is null. Any thoughts? 

Tom
Super User Tom
Super User
Sounds like you tried to just read the directory instead of actually trying to use the wildcard to name a file. You might want to add %str() around the * to prevent SAS from seeing /* as starting a comment. Either that or you have a sub directory named whose name ends in .txt.
bknitch
Quartz | Level 8

Heres a snippet from the log 

 

NOTE: Invalid data for STATEFIPS in line 2 1-775.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0

2 CHAR 1.AL.0.1.0_25.836320.481570.109790.233260.455560.1356760.525260.26020.16850.9170.5730.11620.255610.1
ZONE 3044030303533033333303333330333333033333303333330333333303333330333330333330333303333033333033333303
NUMR 191C909190F25983632094815709109790923326094555609135676095252609260209168509917095730911620925561091

101 49910.10855656.836310.11036309.669770.8646476.92010.59467.43130.72032.38540.47871.12050.5426.146630.
ZONE 3333303333333303333330333333330333333033333330333330333330333330333330333330333330333330333303333330
NUMR 4991091085565698363109110363099669770986464769920109594679431309720329385409478719120509542691466309
2

ballardw
Super User

@bknitch wrote:

Heres a snippet from the log 

 

NOTE: Invalid data for STATEFIPS in line 2 1-775.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0

2 CHAR 1.AL.0.1.0_25.836320.481570.109790.233260.455560.1356760.525260.26020.16850.9170.5730.11620.255610.1
ZONE 3044030303533033333303333330333333033333303333330333333303333330333330333330333303333033333033333303
NUMR 191C909190F25983632094815709109790923326094555609135676095252609260209168509917095730911620925561091

101 49910.10855656.836310.11036309.669770.8646476.92010.59467.43130.72032.38540.47871.12050.5426.146630.
ZONE 3333303333333303333330333333330333333033333330333330333330333330333330333330333330333330333303333330
NUMR 4991091085565698363109110363099669770986464769920109594679431309720329385409478719120509542691466309
2


Start with the DATA statement through the RUN with all the notes. Paste the LOG into a code box opened with the </> icon on the form to preserve formatting.

The error indicates that what ever the data may be the INPUT information you gave it is incompatible with the data for that line (and likely others).

Tom
Super User Tom
Super User
So your other code wasn't going to work either. That file is using TAB (09 in hexcode) as the delimiter, not comma.
bknitch
Quartz | Level 8
Thank you so much for the insight and detailed instructions! This worked like a charm and got rid of the messy macro! Thanks again.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2872 views
  • 4 likes
  • 3 in conversation