Without the "delete;", the first row from every file is blank: all of its values are missing.
With "delete;" in the do loop, the first row of every file is skipped, so the final merged file has 364 fewer rows than the merged file produced without "delete;".
data import_all;
    *make sure variables to store file name are long enough;
    length filename txt_file_name $256;
    *keep file name from record to record;
    retain txt_file_name;
    *Use wildcard in input;
    infile "D:\data\Transaction2014*.csv" eov=eov filename=filename truncover delimiter = ',' MISSOVER DSD lrecl=32767;
    informat VAR1 yymmdd10. ;
    informat VAR2 $40. ;
    .... /* to save space */
    informat VAR26 $4. ;
    informat VAR27 $19. ;
    informat VAR28 anydtdtm40. ;
    format VAR1 yymmdd10. ;
    format VAR2 $40. ;
    .... /* to save space */
    format VAR26 $4. ;
    format VAR27 $19. ;
    format VAR28 datetime. ;
    *Input first record and hold line;
    input @;
    *Check if this is the first record or the first record in a new file;
    *If it is, replace the filename with the new file name and move to next line;
    if _n_ eq 1 or eov then do;
        txt_file_name = scan(filename, -1, "\");
        eov=0; delete; /* with or without it, the first row will be either missing values or skipped */
    end;
    *Otherwise go to the import step and read the files;
    else do;
        input
            /*Place input code here;*/
            VAR1
            VAR2 $
            .... /* to save space */
            VAR26 $
            VAR27 $
            VAR28
        ;
    end;
run;
I found the problem. The author says elsewhere that
if _n_ eq 1 or eov then do;
    txt_file_name = scan(filename, -1, "\");
    eov=0;
end;
assumes that each file has column headers and uses the EOV option to account for them.
In my case, the files have no column headers. How should I change the code accordingly?
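One possible direction, as an untested sketch: because the files have no header line, the record held by "input @;" is already a data row. The file-name bookkeeping could therefore stay in the if block, while the full input runs on every record instead of only in the else branch. Only VAR1 and VAR2 are listed here to keep the example short, and the infile options are trimmed; the remaining variables would be declared and read the same way as in the step above.

```sas
data import_all;
    /* make sure the file-name variables are long enough */
    length filename txt_file_name $256;
    /* carry the file name from record to record */
    retain txt_file_name;
    infile "D:\data\Transaction2014*.csv" eov=eov filename=filename
           dsd truncover delimiter=',' lrecl=32767;
    /* shortened variable list, for illustration only */
    informat VAR1 yymmdd10. VAR2 $40.;
    format   VAR1 yymmdd10. VAR2 $40.;
    /* load the next record and hold it so FILENAME= and EOV= are current */
    input @;
    /* first record overall, or first record of a new file: update the name only */
    if _n_ eq 1 or eov then do;
        txt_file_name = scan(filename, -1, "\");
        eov = 0;
    end;
    /* no header to skip, so read the held record on every iteration */
    input VAR1 VAR2 $;
run;
```

With no "delete;" and no else branch, the first record of each file should be read like any other row rather than being dropped or written out as missing values.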