Hi,
I hope you can help me!
I've a folder of .TXT files which I try to read in SAS into on table. This is working - by itself - fine. However, I've also .TXT files for which I've to use a different INPUT / FORMAT combination based on the first two digits of each row. This is also working - by itself - fine. Now, I try to combine both which is resulting in a program which is running forever...
The code is as follows:
DATA HISTORY_&BORD_NAME;
SET FILELIST;
FILEPATH = "&DIRNAME"||FILE_NAME;
INFILE TEMP FILEVAR = FILEPATH END = DONE TRUNCOVER;
INPUT @1 RECORD_ID $2. @;
DO WHILE(NOT DONE);
IF RECORD_ID = "11" THEN
INPUT
@1 RECORD_ID $2.
@3 DEALER_NUMBER $4.
;
FORMAT
RECORD_ID $CHAR2.
DEALER_NUMBER $CHAR4.
;
ELSE IF RECORD_ID = "22" THEN
INPUT
@1 RECORD_ID $2.
@3 INVOICE_NUMBER $7.
;
FORMAT
RECORD_ID $CHAR2.
INVOICE_NUMBER $CHAR4.
;
OUTPUT;
END;
RUN;
What's the mistake in the above which is causing the "forever" run?
Many thanks!
Regards,
Sven
What's the mistake in the above which is causing the "forever" run?
Could be that one of your data files contains a record which does not begin with "11" or "22" - in which case your datastep will not execute any INPUT statements, only the final OUTPUT statement, and will be stuck in the DO loop. Or it may be a problem with the END=DONE condition, which may not be set correctly when you read from multiple infiles like that. In the last case, you may want to take a look at the EOF= option for infiles.
All in all, I would try something like this:
DATA HISTORY_&BORD_NAME;
Next_file:
SET FILELIST;
FILEPATH = "&DIRNAME"||FILE_NAME;
INFILE TEMP FILEVAR = FILEPATH EOF=Next_file TRUNCOVER;
DO WHILE(1);
INPUT @1 RECORD_ID $2. @;
IF RECORD_ID = "11" THEN
INPUT
@3 DEALER_NUMBER $4.
;
ELSE IF RECORD_ID = "22" THEN
INPUT
@3 INVOICE_NUMBER $7.
;
else do; /* In case we have an invalid(?) record ID */
input; /* skip to next line */
error; /* write the stuff to log */
continue; /* do not output */
end;
OUTPUT;
END;
FORMAT
RECORD_ID $CHAR2.
DEALER_NUMBER $CHAR4.
INVOICE_NUMBER $CHAR4.
;
RUN;
This EOF= option makes the program jump to the label indicated when a read after end of file occurs. And the SET statement terminates the datastep, when there are no more records in FILELIST.
I moved the FORMAT statement to the end, as it is not executable (it makes no sense to put it inside a conditionally executed block).
If you have two different data metadata, then write two different data import programs, simple then. You could do a pre-check of a file and call the relevant code from that:
%macro import_data (f=); data _null_; infile "&f." obs=1; input; if _input_="01" then call execute(cats('%Import_seta (f=',&f.,'));')); else call execute('cats('%impor_setb (f=',&f.,'));')); run; %mend import_data; data _null_; set listoffiles; call execute(cats('%import_data (f=',filemname,');)); run;
Something like that, so for each file, a line is read in, if the text=per the if, then call one macro else call another. Of course I am flying blind here in your process, this is just an example.
Also, please avoid coding in capitals, it really makes it unreadable.
What's the mistake in the above which is causing the "forever" run?
Could be that one of your data files contains a record which does not begin with "11" or "22" - in which case your datastep will not execute any INPUT statements, only the final OUTPUT statement, and will be stuck in the DO loop. Or it may be a problem with the END=DONE condition, which may not be set correctly when you read from multiple infiles like that. In the last case, you may want to take a look at the EOF= option for infiles.
All in all, I would try something like this:
DATA HISTORY_&BORD_NAME;
Next_file:
SET FILELIST;
FILEPATH = "&DIRNAME"||FILE_NAME;
INFILE TEMP FILEVAR = FILEPATH EOF=Next_file TRUNCOVER;
DO WHILE(1);
INPUT @1 RECORD_ID $2. @;
IF RECORD_ID = "11" THEN
INPUT
@3 DEALER_NUMBER $4.
;
ELSE IF RECORD_ID = "22" THEN
INPUT
@3 INVOICE_NUMBER $7.
;
else do; /* In case we have an invalid(?) record ID */
input; /* skip to next line */
error; /* write the stuff to log */
continue; /* do not output */
end;
OUTPUT;
END;
FORMAT
RECORD_ID $CHAR2.
DEALER_NUMBER $CHAR4.
INVOICE_NUMBER $CHAR4.
;
RUN;
This EOF= option makes the program jump to the label indicated when a read after end of file occurs. And the SET statement terminates the datastep, when there are no more records in FILELIST.
I moved the FORMAT statement to the end, as it is not executable (it makes no sense to put it inside a conditionally executed block).
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.