BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sagulolo
Quartz | Level 8

Hi

 

I have a infile code that work for 1 txt file, recently want to enhance it to infile more than 1 txt file into 1 dataset. Code can proceeded and i'm facing difficulty to slot in 1 programming command(to delete empty and unwanted row) as result shown 1st record read and then it jump to 2nd txt file, instead of contnue till end of the 1st txt file.

 

my code as below:-

 

 

filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';

data new;
infile fn;
input name:&$100.;

infile in filevar=name end=last firstobs=11 missover dsd ;

retain Grp_AC    No	Batch_no	TRX_CODE   Amount	Block_Code ; 

format Grp_AC $char11.
	   Batch_no $char14.
	   Block_Code $char4.
	   No $char16.
	   TRX_CODE $char3.;

do while(not last);

Input  Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );

input    Grp_AC $ 1-24    No $ 31-48  Batch_no $  52-66   TRX_CODE $ 72-76  @86   Amount  trailsgn16. Block_Code  $ 110-113 ;

output;
end;
run;

wanna to insert code below:-

 

 

if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count'  then delete;

 

 

Thank in advance.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Because you have the INPUT statements for the data files inside of a DO loop you need to move to the next record when you are on one that you want to skip.  Add and an 

else input ;

to the IF/THEN statement.

View solution in original post

24 REPLIES 24
Shmuel
Garnet | Level 18

Assuming all input text files have the same format and should be treated the same

enter your code - dealing with the variables just read - just before the OUTPUT statement.

sagulolo
Quartz | Level 8

Hi Shmuel,

 

Did try below, but it only read 1 record from each txt file. Think the delete stop the current process and went to next txt file

 

filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';

data new;
infile fn;
input name:&$100.;

infile in filevar=name end=last firstobs=11 missover dsd ;

retain Grp_AC    No	Batch_no	TRX_CODE   Amount	Block_Code ; 

format Grp_AC $char11.
	   Batch_no $char14.
	   Block_Code $char4.
	   No $char16.
	   TRX_CODE $char3.;

do while(not last);

Input  Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );

if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count'  then delete;

input    Grp_AC $ 1-24    No $ 31-48  Batch_no $  52-66   TRX_CODE $ 72-76  @86   Amount  trailsgn16. Block_Code  $ 110-113 ;

output;
end;
run;
Shmuel
Garnet | Level 18

Try next code:

 

filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';

data new;
infile fn  EOV=Done;  /*** Option EOD added ***/
input name:&$100.;

infile in filevar=name end=last firstobs=11 missover dsd ;

retain Grp_AC    No	Batch_no	TRX_CODE   Amount	Block_Code ; 

format Grp_AC $char11.
	   Batch_no $char14.
	   Block_Code $char4.
	   No $char16.
	   TRX_CODE $char3.;

do while(not last);

Input  Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );

input    Grp_AC $ 1-24    No $ 31-48  Batch_no $  52-66   TRX_CODE $ 72-76  @86   Amount  trailsgn16. Block_Code  $ 110-113 ;

/*** place of the if changed ***/
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count'  then delete;


output;
end;
run;

 

Tom
Super User Tom
Super User

You don't want to use the DELETE statement since that will end the data step iteration and start from the next file in the input list.

So instead of 

if (condition) then delete;

You want to code 

if not (condition) then output;

So your program might look like this:

data new;
  infile 'dir C:\Users\Desktop\testing\*.txt /s /b' pipe truncover;
  length Grp_AC $11 No $16 Batch_no $14 
         TRX_CODE $3 Amount 8 Block_Code $4 
  ;
  input name $256. ;
  file_name = scan( scan( name, -1, '\' ), 1, '.' );
  infile in filevar=name end=last firstobs=11 truncover  ;
  do while (not last);
    input  Grp_AC $ 1-24 @;
    if  not (Grp_AC in ('Grp A' , ' ', 'OVER', 'Total Count') ) then do;
      input Grp_AC $ 1-24
            No $ 31-48
            Batch_no $  52-66   
            TRX_CODE $ 72-76  
            @86   Amount  trailsgn16. 
            Block_Code  $ 110-113 
      ;
      output;
    end;
  end;
run;

 

 

 

 

 

TomKari
Onyx | Level 15

Reply number one:

 

A lot of managed SAS environments won't let you use the "X" statement, so here's an alternate, really easy way to get all of the datasets in a directory. You just need to add a bit of code to select only the ".txt" datasets.

 

/* Set the directory name to scan */

%let DirectoryName = C:\Users\Desktop\testing;

data DatasetsInADirectory;

keep DirectoryAndDataset;

length DatasetName DirectoryAndDataset $1024 FileRef $8;

/* Assign the fileref */

call missing(FileRef); /* Blank, so SAS will assign a file name */

rc1 = filename(FileRef, "&DirectoryName."); /* Associate the file name with the directory */

if rc1 ^= 0 then

abort;

/* Open the directory for access by SAS */

DirectoryID = dopen(FileRef);

if DirectoryID = 0 then

abort;

/* Get the count of directories and datasets */

MemberCount = dnum(DirectoryID);

if MemberCount = 0 then

abort;

/* Get all of the entry names ... directories and datasets */

do MemberIndex = 1 to MemberCount;

DatasetName = dread(DirectoryID, MemberIndex);

if missing(DatasetName) then

abort;

DirectoryAndDataset = cats("&DirectoryName.","/",DatasetName);

output;

end;

/* Close the directory */

rc2 = dclose(DirectoryID);

if rc2 ^= 0 then

abort;

run;

 

TomKari
Onyx | Level 15

Reply number two

 

Ordinarily, I hesitate to recommend SAS macros to less-experienced users. But I also hate to see "do until" type processing of input datasets.

 

So here's the simplest possible macro which will give you similar results. You'll notice it has the advantage that although there are more steps, each step is extremely simple, and stands alone.

 

It uses the results of my first reply.

 

Tom

 

%macro GetTXTData(DsnToGet);

data temp;

infile "&DsnToGet."; /* You had additional infile options, which would go here */

Input Grp_AC $; /* Again, change as required */

/* additional processing here */

run;

proc datasets lib=work nolist;

append data=temp base=new;

delete temp;

quit;

%mend;

proc datasets lib=work nolist;

delete new;

quit;

data _null_;

length SASMacroCall $1024;

set DatasetsInADirectory;

SASMacroCall = %nrstr(cats("%GetTXTData(", strip(DirectoryAndDataset), ");"));

call execute(SASMacroCall);

run;

sagulolo
Quartz | Level 8

Hi Reeza/shmuel,

 

thank for the code and guidance, i actually have 4 txt file, try below:-

 

filename fn pipe 'dir C:\Users\Desktop\testing\*.txt /s /b';

data new;
infile fn  EOV=Done;  /*** Option EOD added ***/
input name:&$100.;

infile in filevar=name end=last firstobs=11 missover dsd ;

retain Grp_AC    No	Batch_no	TRX_CODE   Amount	Block_Code ; 

format Grp_AC $char11.
	   Batch_no $char14.
	   Block_Code $char4.
	   No $char16.
	   TRX_CODE $char3.;

do while(not last);

Input  Grp_AC $ 1-24 @;
file_name = scan( scan( name, -1, '\' ), 1, '.' );

input    Grp_AC $ 1-24    No $ 31-48  Batch_no $  52-66   TRX_CODE $ 72-76  @86   Amount  trailsgn16. Block_Code  $ 110-113 ;

/*** place of the if changed ***/
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count'  then delete;


output;
end;
run;

only get total 4 record from 4 txt file.

sagulolo
Quartz | Level 8

Hi Tom,

 

Try run the code and it take long time to process(1 hour waited) and still running. I stop the code running and check the log, seem it stuck at 2nd record.

 

data new;
  infile 'dir C:\Users\Desktop\testing\*.txt /s /b' pipe truncover;
  length Grp_AC $11 No $16 Batch_no $14 
         TRX_CODE $3 Amount 8 Block_Code $4 
  ;
  input name $256. ;
  file_name = scan( scan( name, -1, '\' ), 1, '.' );
  infile in filevar=name end=last firstobs=11 truncover  ;
  do while (not last);
    input  Grp_AC $ 1-24 @;
    if  not (Grp_AC in ('Grp A' , ' ', 'OVER', 'Total Count') ) then do;
      input Grp_AC $ 1-24
            No $ 31-48
            Batch_no $  52-66   
            TRX_CODE $ 72-76  
            @86   Amount  trailsgn16. 
            Block_Code  $ 110-113 
      ;
      output;
    end;
  end;
run;

log.png

 

sagulolo
Quartz | Level 8

Hi TomKari,

 

Thank for your code, tried the code and it actually import the data to SAS. Problem is how do I get rid of unwanted row like blank and header.

 

below is those row I don't want:-

 

Grp_AC = 'Grp A'

Grp_AC = ''

Grp_AC = 'OVER'

Grp_AC = 'Total Count'

 

where should I insert those code, assume my code as below or any other alternative.

 

if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count'  then delete;

 

TomKari
Onyx | Level 15

That's the advantage to this approach, to make up for having to deal with macros. The "data" step inside the macro is just a normal SAS data step that creates one output SAS record for every input dataset record.

 

So to get rid of unwanted records, all you need to do is this (also, if you haven't already done it, change your "input" statement and associated processing as you had them in your original post".

 

 Input Grp_AC $; /* Again, change as required */

/* additional processing here */
if Grp_AC = 'Grp A' or Grp_AC = '' or Grp_AC = 'OVER' or Grp_AC = 'Total Count'  then delete;

run;

 

Tom

Tom
Super User Tom
Super User

Because you have the INPUT statements for the data files inside of a DO loop you need to move to the next record when you are on one that you want to skip.  Add and an 

else input ;

to the IF/THEN statement.

sagulolo
Quartz | Level 8

Hi Tom,

 

Apologize for late, just wonder if any way to skip that record(unwanted row) instead of insert empty row?

 

thanks

 

Tom
Super User Tom
Super User

@sagulolo wrote:

Hi Tom,

 

Apologize for late, just wonder if any way to skip that record(unwanted row) instead of insert empty row?

 

thanks

 


The code I posted should only output records that meet your criteria. This is because when you include an explicit OUTPUT statement then SAS will not automatically output the observation at the end of the data step like it normally does.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 24 replies
  • 1632 views
  • 0 likes
  • 5 in conversation