Hi
I have a infile code that work for 1 txt file, recently want to enhance it to infile more than 1 txt file into 1 dataset. Code can proceeded and i'm facing difficulty to slot in 1 programming command(to delete empty and unwanted row) as result shown 1st record read and then it jump to 2nd txt file, instead of contnue till end of the 1st txt file.
my code as below:-
filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';
data new;
infile fn;
input name:&$100.;
infile in filevar=name end=last firstobs=11 missover dsd ;
retain Grp_AC No Batch_no TRX_CODE Amount Block_Code ;
format Grp_AC $char11.
Batch_no $char14.
Block_Code $char4.
No $char16.
TRX_CODE $char3.;
do while(not last);
Input Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );
input Grp_AC $ 1-24 No $ 31-48 Batch_no $ 52-66 TRX_CODE $ 72-76 @86 Amount trailsgn16. Block_Code $ 110-113 ;
output;
end;
run;
wanna to insert code below:-
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count' then delete;
Thank in advance.
Because you have the INPUT statements for the data files inside of a DO loop you need to move to the next record when you are on one that you want to skip. Add and an
else input ;
to the IF/THEN statement.
Assuming all input text files have the same format and should be treated the same
enter your code - dealing with the variables just read - just before the OUTPUT statement.
Hi Shmuel,
Did try below, but it only read 1 record from each txt file. Think the delete stop the current process and went to next txt file
filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';
data new;
infile fn;
input name:&$100.;
infile in filevar=name end=last firstobs=11 missover dsd ;
retain Grp_AC No Batch_no TRX_CODE Amount Block_Code ;
format Grp_AC $char11.
Batch_no $char14.
Block_Code $char4.
No $char16.
TRX_CODE $char3.;
do while(not last);
Input Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count' then delete;
input Grp_AC $ 1-24 No $ 31-48 Batch_no $ 52-66 TRX_CODE $ 72-76 @86 Amount trailsgn16. Block_Code $ 110-113 ;
output;
end;
run;
See an example here
Use EOV on the INFILE to identify the first record.
Try next code:
filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';
data new;
infile fn EOV=Done; /*** Option EOD added ***/
input name:&$100.;
infile in filevar=name end=last firstobs=11 missover dsd ;
retain Grp_AC No Batch_no TRX_CODE Amount Block_Code ;
format Grp_AC $char11.
Batch_no $char14.
Block_Code $char4.
No $char16.
TRX_CODE $char3.;
do while(not last);
Input Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );
input Grp_AC $ 1-24 No $ 31-48 Batch_no $ 52-66 TRX_CODE $ 72-76 @86 Amount trailsgn16. Block_Code $ 110-113 ;
/*** place of the if changed ***/
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count' then delete;
output;
end;
run;
You don't want to use the DELETE statement since that will end the data step iteration and start from the next file in the input list.
So instead of
if (condition) then delete;
You want to code
if not (condition) then output;
So your program might look like this:
data new;
infile 'dir C:\Users\Desktop\testing\*.txt /s /b' pipe truncover;
length Grp_AC $11 No $16 Batch_no $14
TRX_CODE $3 Amount 8 Block_Code $4
;
input name $256. ;
file_name = scan( scan( name, -1, '\' ), 1, '.' );
infile in filevar=name end=last firstobs=11 truncover ;
do while (not last);
input Grp_AC $ 1-24 @;
if not (Grp_AC in ('Grp A' , ' ', 'OVER', 'Total Count') ) then do;
input Grp_AC $ 1-24
No $ 31-48
Batch_no $ 52-66
TRX_CODE $ 72-76
@86 Amount trailsgn16.
Block_Code $ 110-113
;
output;
end;
end;
run;
Reply number one:
A lot of managed SAS environments won't let you use the "X" statement, so here's an alternate, really easy way to get all of the datasets in a directory. You just need to add a bit of code to select only the ".txt" datasets.
/* Set the directory name to scan */
%let DirectoryName = C:\Users\Desktop\testing;
data DatasetsInADirectory;
keep DirectoryAndDataset;
length DatasetName DirectoryAndDataset $1024 FileRef $8;
/* Assign the fileref */
call missing(FileRef); /* Blank, so SAS will assign a file name */
rc1 = filename(FileRef, "&DirectoryName."); /* Associate the file name with the directory */
if rc1 ^= 0 then
abort;
/* Open the directory for access by SAS */
DirectoryID = dopen(FileRef);
if DirectoryID = 0 then
abort;
/* Get the count of directories and datasets */
MemberCount = dnum(DirectoryID);
if MemberCount = 0 then
abort;
/* Get all of the entry names ... directories and datasets */
do MemberIndex = 1 to MemberCount;
DatasetName = dread(DirectoryID, MemberIndex);
if missing(DatasetName) then
abort;
DirectoryAndDataset = cats("&DirectoryName.","/",DatasetName);
output;
end;
/* Close the directory */
rc2 = dclose(DirectoryID);
if rc2 ^= 0 then
abort;
run;
Reply number two
Ordinarily, I hesitate to recommend SAS macros to less-experienced users. But I also hate to see "do until" type processing of input datasets.
So here's the simplest possible macro which will give you similar results. You'll notice it has the advantage that although there are more steps, each step is extremely simple, and stands alone.
It uses the results of my first reply.
Tom
%macro GetTXTData(DsnToGet);
data temp;
infile "&DsnToGet."; /* You had additional infile options, which would go here */
Input Grp_AC $; /* Again, change as required */
/* additional processing here */
run;
proc datasets lib=work nolist;
append data=temp base=new;
delete temp;
quit;
%mend;
proc datasets lib=work nolist;
delete new;
quit;
data _null_;
length SASMacroCall $1024;
set DatasetsInADirectory;
SASMacroCall = %nrstr(cats("%GetTXTData(", strip(DirectoryAndDataset), ");"));
call execute(SASMacroCall);
run;
Hi Reeza/shmuel,
thank for the code and guidance, i actually have 4 txt file, try below:-
filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';
data new;
infile fn EOV=Done; /*** Option EOD added ***/
input name:&$100.;
infile in filevar=name end=last firstobs=11 missover dsd ;
retain Grp_AC No Batch_no TRX_CODE Amount Block_Code ;
format Grp_AC $char11.
Batch_no $char14.
Block_Code $char4.
No $char16.
TRX_CODE $char3.;
do while(not last);
Input Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );
input Grp_AC $ 1-24 No $ 31-48 Batch_no $ 52-66 TRX_CODE $ 72-76 @86 Amount trailsgn16. Block_Code $ 110-113 ;
/*** place of the if changed ***/
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count' then delete;
output;
end;
run;
only get total 4 record from 4 txt file.
Hi Tom,
Try run the code and it take long time to process(1 hour waited) and still running. I stop the code running and check the log, seem it stuck at 2nd record.
data new;
infile 'dir C:\Users\Desktop\testing\*.txt /s /b' pipe truncover;
length Grp_AC $11 No $16 Batch_no $14
TRX_CODE $3 Amount 8 Block_Code $4
;
input name $256. ;
file_name = scan( scan( name, -1, '\' ), 1, '.' );
infile in filevar=name end=last firstobs=11 truncover ;
do while (not last);
input Grp_AC $ 1-24 @;
if not (Grp_AC in ('Grp A' , ' ', 'OVER', 'Total Count') ) then do;
input Grp_AC $ 1-24
No $ 31-48
Batch_no $ 52-66
TRX_CODE $ 72-76
@86 Amount trailsgn16.
Block_Code $ 110-113
;
output;
end;
end;
run;
Hi TomKari,
Thank for your code, tried the code and it actually import the data to SAS. Problem is how do I get rid of unwanted row like blank and header.
below is those row I don't want:-
Grp_AC = 'Grp A'
Grp_AC = ''
Grp_AC = 'OVER'
Grp_AC = 'Total Count'
where should I insert those code, assume my code as below or any other alternative.
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count' then delete;
That's the advantage to this approach, to make up for having to deal with macros. The "data" step inside the macro is just a normal SAS data step that creates one output SAS record for every input dataset record.
So to get rid of unwanted records, all you need to do is this (also, if you haven't already done it, change your "input" statement and associated processing as you had them in your original post".
Input Grp_AC $; /* Again, change as required */
/* additional processing here */
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count' then delete;
run;
Tom
Because you have the INPUT statements for the data files inside of a DO loop you need to move to the next record when you are on one that you want to skip. Add and an
else input ;
to the IF/THEN statement.
Hi Tom,
Apologize for late, just wonder if any way to skip that record(unwanted row) instead of insert empty row?
thanks
@sagulolo wrote:
Hi Tom,
Apologize for late, just wonder if any way to skip that record(unwanted row) instead of insert empty row?
thanks
The code I posted should only output records that meet your criteria. This is because when you include an explicit OUTPUT statement then SAS will not automatically output the observation at the end of the data step like it normally does.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.