DATA Step, Macro, Functions and more

Read multiple .TXT files into SAS with a different format for each row

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Read multiple .TXT files into SAS with a different format for each row

Hi,

 

I hope you can help me!

 

I've a folder of .TXT files which I try to read in SAS into on table. This is working - by itself - fine. However, I've also .TXT files for which I've to use a different INPUT / FORMAT combination based on the first two digits of each row. This is also working - by itself - fine. Now, I try to combine both which is resulting in a program which is running forever...


The code is as follows:

 

DATA HISTORY_&BORD_NAME;
SET FILELIST;

FILEPATH = "&DIRNAME"||FILE_NAME;
INFILE TEMP FILEVAR = FILEPATH END = DONE TRUNCOVER;

INPUT @1 RECORD_ID $2. @;

DO WHILE(NOT DONE);
	IF RECORD_ID = "11" THEN
		INPUT
		        @1   RECORD_ID	 		$2.
		        @3   DEALER_NUMBER		$4.
		;
		FORMAT
				RECORD_ID		 		$CHAR2.
				DEALER_NUMBER    		$CHAR4.
		;
	ELSE IF RECORD_ID = "22" THEN 
		INPUT 
	        	@1   RECORD_ID	 		$2.
		    	@3   INVOICE_NUMBER		$7.
		;	
		FORMAT
				RECORD_ID		 		$CHAR2.
				INVOICE_NUMBER    		$CHAR4.
		;
	OUTPUT;
END;

RUN;

What's the mistake in the above which is causing the "forever" run?

 

Many thanks!


Regards,


Sven


Accepted Solutions
Solution
‎02-25-2018 08:21 AM
PROC Star
Posts: 275

Re: Read multiple .TXT files into SAS with a different format for each row

[ Edited ]

What's the mistake in the above which is causing the "forever" run?

Could be that one of your data files contains a record which does not begin with "11" or "22" - in which case your datastep will not execute any INPUT statements, only the final OUTPUT statement, and will be stuck in the DO loop. Or it may be a problem with the END=DONE condition, which may not be set correctly when you read from multiple infiles like that. In the last case, you may want to take a look at the EOF= option for infiles.

 

All in all, I would try something like this:

DATA HISTORY_&BORD_NAME;
  Next_file:
  SET FILELIST;

  FILEPATH = "&DIRNAME"||FILE_NAME;
  INFILE TEMP FILEVAR = FILEPATH EOF=Next_file TRUNCOVER;

  DO WHILE(1);
    INPUT @1 RECORD_ID $2. @;
    IF RECORD_ID = "11" THEN
      INPUT
        @3   DEALER_NUMBER            $4.
        ;
    ELSE IF RECORD_ID = "22" THEN
      INPUT
        @3   INVOICE_NUMBER            $7.
        ;
    else do;  /* In case we have an invalid(?) record ID */
      input;  /* skip to next line */
      error;  /* write the stuff to log */
      continue; /* do not output */
      end;
    OUTPUT;
    END;
 FORMAT
   RECORD_ID        $CHAR2.
   DEALER_NUMBER    $CHAR4.
   INVOICE_NUMBER   $CHAR4.
   ;
RUN;

This EOF= option makes the program jump to the label indicated when a read after end of file occurs. And the SET statement terminates the datastep, when there are no more records in FILELIST.

 

I moved the FORMAT statement to the end, as it is not executable (it makes no sense to put it inside a conditionally executed block).

View solution in original post


All Replies
Super User
Super User
Posts: 9,840

Re: Read multiple .TXT files into SAS with a different format for each row

If you have two different data metadata, then write two different data import programs, simple then.  You could do a pre-check of a file and call the relevant code from that:

%macro import_data (f=);
  data _null_;
    infile "&f." obs=1;
    input;
    if _input_="01" then call execute(cats('%Import_seta (f=',&f.,'));'));
    else call execute('cats('%impor_setb (f=',&f.,'));'));
  run;

%mend import_data;



data _null_;
  set listoffiles;
  call execute(cats('%import_data (f=',filemname,');));
run;

Something like that, so for each file, a line is read in, if the text=per the if, then call one macro else call another.  Of course I am flying blind here in your process, this is just an example.

 

Also, please avoid coding in capitals, it really makes it unreadable.

Solution
‎02-25-2018 08:21 AM
PROC Star
Posts: 275

Re: Read multiple .TXT files into SAS with a different format for each row

[ Edited ]

What's the mistake in the above which is causing the "forever" run?

Could be that one of your data files contains a record which does not begin with "11" or "22" - in which case your datastep will not execute any INPUT statements, only the final OUTPUT statement, and will be stuck in the DO loop. Or it may be a problem with the END=DONE condition, which may not be set correctly when you read from multiple infiles like that. In the last case, you may want to take a look at the EOF= option for infiles.

 

All in all, I would try something like this:

DATA HISTORY_&BORD_NAME;
  Next_file:
  SET FILELIST;

  FILEPATH = "&DIRNAME"||FILE_NAME;
  INFILE TEMP FILEVAR = FILEPATH EOF=Next_file TRUNCOVER;

  DO WHILE(1);
    INPUT @1 RECORD_ID $2. @;
    IF RECORD_ID = "11" THEN
      INPUT
        @3   DEALER_NUMBER            $4.
        ;
    ELSE IF RECORD_ID = "22" THEN
      INPUT
        @3   INVOICE_NUMBER            $7.
        ;
    else do;  /* In case we have an invalid(?) record ID */
      input;  /* skip to next line */
      error;  /* write the stuff to log */
      continue; /* do not output */
      end;
    OUTPUT;
    END;
 FORMAT
   RECORD_ID        $CHAR2.
   DEALER_NUMBER    $CHAR4.
   INVOICE_NUMBER   $CHAR4.
   ;
RUN;

This EOF= option makes the program jump to the label indicated when a read after end of file occurs. And the SET statement terminates the datastep, when there are no more records in FILELIST.

 

I moved the FORMAT statement to the end, as it is not executable (it makes no sense to put it inside a conditionally executed block).

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 98 views
  • 1 like
  • 3 in conversation