BookmarkSubscribeRSS Feed
somu19tec
Calcite | Level 5

Hi All, 

 

I am trying to read a file.

I have to read the detail records, only if the trailer has the err_code as blank.

Please find the file details below.

 

data Header(keep=Filename Date ) trailer(keep=Record_Count Err_Code ) ;
infile cards ;
input record_type $2.@;
if record_type ='HH' then do;
input
@4 Filename : $21.
@26Date :$8.;
output Header ;
end;

if record_type ='FF' then do;
input
@4 Record_Count $16.
@21Err_Code $16.;

output trailer;
end;


cards;
HH|PU_AH_ICA_STUD_IN.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|
;
run;

 

any help on this is appreciated.

 

Thanks!!

Soma

 

 

7 REPLIES 7
ed_sas_member
Meteorite | Level 14

Hi @somu19tec 

If I understand well, you want to output 2 tables from the input data:

- 'header' -> with info related to records starting with 'HH'

- 'trailer' -> with info related to records starting with 'FF' and Err_code not missing

Records beginning with 'S' will not be imported. Is that right?

 

In this case, you can just add the IF condition in this part of your code:

	if record_type='FF' then do;
			input
@4 Record_Count $16.
@21Err_Code $16.;
			if nmiss(Err_Code) then output trailer;
	end;

Hope this help,

Best,

 

 

somu19tec
Calcite | Level 5

Hi @ed_sas_member ,

 

I wanted to create another dataset called Details which should have records other than HH and FF and also satisfy the condition that err_code is blank from the trailer record.

 

Basically i wanted to create the dataset details based on the err_code field in the trailer record.

 

Please let me know if you need further information.

 

Thanks!!

Soma

ed_sas_member
Meteorite | Level 14

Hi @somu19tec 

 

Thank you for the clarification and sorry for the confusion!

I think you can try to insert this code :

	if record_type not in('HH','FF') then do;
		input details $ 1-34 @;
		if find(details,'ERR')=0 then output details ;
	end;

 

So it will result in:

data Header (keep=Filename Date) trailer (keep=Record_Count Err_Code) details (keep=details);
	infile cards;
	input record_type $2.@;

	if record_type='HH' then do;
			input
    @4 Filename $21.
    @26 Date yymmdd8.;
			output Header;
	end;

	if record_type='FF' then do;
			input
    @4 Record_Count 16.
    @21 Err_Code $16.;
			output trailer;
	end;
	
	if record_type not in('HH','FF') then do;
		input details $ 1-34 @;
		if find(details,'ERR')=0 then output details ;
	end;
	
	format date yymmddd10.;
	cards;
HH|PU_AH_ICA_STUD_IN.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|
;

 

 

Hope this help.

 

Best,

somu19tec
Calcite | Level 5

Hi @ed_sas_member ,

 

The if condition should check the err_code field read from the trailer record and then it has to output the details dataset.

 

if find(details,'ERR')=0 then output details ;

 Please let me know if you need any details.

 

Thanks!!

Soma

Kurt_Bremser
Super User

Forget my previous post, see this instead:

Based on your description, we will need two steps for the details, as the determining row is read after the block.

I also included the sorts necessary for the join, and date with an error code.

data
  Header (keep=Filename Date)
  trailer (keep=filename date Record_Count Err_Code)
  _details (keep=filename date details)
;
infile cards;
input record_type $2.@;
retain filename date;
if record_type = 'HH' then do;
  input
    @4 Filename $21.
    @26 Date yymmdd8.
  ;
  output header;
end;
else if record_type = 'FF' then do;
  input
    @4 Record_Count 16.
    @21 Err_Code $16.
  ;
  output trailer;
end;
else do;
  input details $50.;
  output _details;
end;
format date yymmddd10.;
cards;
HH|PU_AH_ICA_STUD_IN.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|
HH|XX_AH_ICA_STUD_IN.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|xxxx
;

proc sort data=header;
by filename date;
run;

proc sort data=trailer;
by filename date;
run;

proc sort data=_details;
by filename date;
run;

data details;
merge
  _details (in=dt)
  trailer (in=tr drop=record_count where=(err_code = ''))
;
by filename date;
if dt and tr;
drop err_code;
run;
Kurt_Bremser
Super User

Use consistent code formatting, read dates as dates, numbers as numbers, and add the appropriate condition:

data
  Header (keep=Filename Date)
  trailer (keep=Record_Count Err_Code)
;
infile cards;
input record_type $2.@;
if record_type = 'HH' then do;
  input
    @4 Filename $21.
    @26 Date yymmdd8.
  ;
  output Header;
end;
if record_type = 'FF' then do;
  input
    @4 Record_Count 16.
    @21 Err_Code $16.
  ;
  if err_code = '' then output trailer;
end;
format date yymmddd10.;
cards;
HH|PU_AH_ICA_STUD_IN.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|
;
ballardw
Super User

@somu19tec wrote:

Hi All, 

 

I am trying to read a file.

I have to read the detail records, only if the trailer has the err_code as blank.

Please find the file details below.

 

Please explicitly define:

Header

Trailer

How we know the err_code is blank.

 

We have to guess from your code that HH is a header and FF is the trailer.

 

Some other details: is there ALWAYS a Header and Trailer pair? or could there be multiple Trailers for a given Header?

 

And if the Header and the Trailer should have some relation what is in the two output data sets to establish/maintain a link between the two? If one of the sets gets sorted or subsetted for any reason how do get related trailer info if needed?

 

And you say you "have to read the detail records". Which again we have to assume are those that are starting with S without any description about how you need to read them. Which may have a bit of a trick as it appears to have a not mentioned sequence of S1 to S10. S1 through S5 may have the same structure but what do you read them into? And S6 might have the same structure but if ZZZ means something other that the '000' that appear in the same positions it might be a good idea to share. As well as how you expect to read the S7 to S10. And are there always exactly the same number of rows of data between the header and trailer.? That might give us a few more options to suggest.

For example if there are ALWAYS exactly 12 lines of data, a header, 10 details and 1 trailer you could read them with something like this:

 

data junk;
   infile datalines n=12 truncover;
   input #1 record_type $2.
           @4 Filename : $21.
         @26Date :yymmdd8.

        #12 @4 Record_Count $16.
            @21 Err_Code $16.
       #1 @
   ;
   if missing(err_code) then
      input #2 s1 :$20.
            #3 s2 :$20.
            #4 s3 :$20.
            #5 s4 :$20.
            #6 s5 :$20.
            #7 s6 :$20.
            #12
 ;
datalines;
HH|PU_AH_ICA_STUD_IN.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|
HH|xxxxxxxxxxxxxxxxx.TXT|20200206
S1000001D2012YYYYYY
S2000002D2010YNYNYN
S3000101D2018YNNNNN
S4000102D2099NNNNNN
S5000103D2010NYNYNY
S6ZZZ104D2012NNNNNN
S7000201E NNNNNNERR00000000021
S8000202EABCDNNNNNNERR00000000021
S9!@#301E2018NNNNNNERR00000000011
S10@#401EZZZZNNNNNNERR00000000011
FF|0000000000000010|
;
run;

Note that at this point I am not splitting any records out as that is trivial and basically to show that many more details of your problem are needed.

If this is similar than each of how to read the details would follow the corresponding S1. I didn't read all, just showing proof of principal.

The N option on the INFILE is how many physical lines are available to the input buffer , which could be set the maximum number of lines (if known) a single record should take. The # beginning lines of the INPUT says "read from line number".

If the number of lines varies you can loop through to the maximum n with Input #var to search for FF which would give you some ideas of the complexity but possibility.

 

Note that the last #12 in the INPUT statement after the missing error code is found is to set the input point to the end of the record (the one with the FF. So the NEXT input will start reading from the line after that.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 1375 views
  • 0 likes
  • 4 in conversation