DATA Step, Macro, Functions and more

Reading raw files

Accepted Solution Solved
Reply
Regular Contributor
Posts: 212
Accepted Solution

Reading raw files

Hi mates,

 

I need a way to create a three positional input due the following txt file:

  

F   015   3  20151029 019  0100             00000000000000 20151029 102004 20151029                     47,28     1530215000019
591 PAGAMENTO INCLU
019  0100             00000000000000 20151029 102155 20151029                    212,50     1530215000020
.   .
018  0023             00000000000000 20151029 103258 20151029                    538,85     1530215000021

 

The first line will start with the F in the column1, 015 in the column 2 and 3 in the column3, all the other values should match for the first row, but in the third row the "019" should be at the same position of "019" in the first line.

 

To understant what i'm saying, the first textbox is how my raw file is in SAS, the second textbox bellow is represented on how the third row should begin:

  

F   015   3  20151029 019  0100             00000000000000 20151029 102004 20151029                     47,28     1530215000019
591 PAGAMENTO INCLU
                      019  0100             00000000000000 20151029 102155 20151029                    212,50     1530215000020
.   .
                      018  0023             00000000000000 20151029 103258 20151029                    538,85     1530215000021

So what i want is to read the first and the second line and print the into the first line.

 

The third row should begin to read or print the value at the same column of the "019" and put missing to the first 4 columns.

 

The final output i'm trying to make is this:

  

F| 015| 3| 20151029   |   019 | 0100 | 00000000000000| 20151029 |102004| 20151029|   47,28     |1530215000019| 591| PAYMENT
 |    |  |            |   019 | 0100 | 00000000000000| 20151029 |102155| 20151029|  212,50     |1530215000020| 591| PAYMENT
 |    |  |            |   018 | 0023 | 00000012344534| 20151029 |103258| 20151029|  538,85     |1530215000021|    |

 

My code is this:

DATA TEST;
/*	INFILE "/home/re43526/quickview/TI47_*.txt"*/
/*        TRUNCOVER FIRSTOBS=11;*/
	INPUT 	 F1		$1-4
			 F2		$4-8
			 F3		$22-25
			 F4		$26-34
			 F5		$43-58
			 F6		$
			 F7		$
			 F8		$
			 F9		$
			 F10    $115-135
			 #2
			 F11	$1-4
			 F12	$4-50;
	CARDS;
F   015   3  20151029 019  0100             00000000000000 20151029 102004 20151029                     47,28     1530215000019
591 PAYMENT
019  0100             00000000000000 20151029 102155 20151029                    212,50     1530215000020
.   .
018  0023             00000000000000 20151029 103258 20151029                    538,85     1530215000021
;
RUN;

 

But i'm having a lot trouble to make this.

Can you help me ?

 

Thanks in advance


Accepted Solutions
Solution
‎11-03-2015 07:07 AM
PROC Star
Posts: 7,363

Re: Reading raw files

Your data file is different than your first example. There are a number of header records and some of the records contain tabs instead of spaces.

 

Regardless, I think the following either works or comes extremely close:

 

DATA need;
INFILE "/folders/myfolders/test.txt" truncover;
informat s_fil $20.;
input @;
if strip(_infile_) eq '' then do;
call missing(s_fil);
input;
end;
else if substr(_infile_,1,3) in ('S.A','ENT','FLL','---','STA') or
substr(_infile_,24,4) eq 'SERV' then do;
call missing(s_fil);
input;
end;
else if anyalpha(substr(_infile_,26,5)) then do;
input s_fil &;
output;
end;
else if substr(_infile_,1,1) eq '09'x then do;
_infile_=compress(_infile_,'09'x);
_infile_=cat(' ',strip(_infile_));
INPUT agda_num $24-26
serv_msg_num $29-32
con $37-44
ret $46-59
dt2 $61-68
txn $70-75
hr_txn_dt $77-84
pgto_pc $86-87
valor $100-110
age_tax_enrlmt_nbr $116-128
;
output;
end;

else do;
INPUT sta $1-4
cal $6-8
sg $12-12
dt1 $15-22
agda_num $24-26
serv_msg_num $29-32
con $37-44
ret $46-59
dt2 $61-68
txn $70-75
hr_txn_dt $77-84
pgto_pc $86-87
valor $100-110
age_tax_enrlmt_nbr $116-128
;
output;
end;
run;
data want (drop=nextSmiley Happy;
recnum=_n_+1;
set need end=last;
if not last and missing(s_fil) then do;
set need (keep=s_fil rename=(s_fil=next_s_fil)) point=recnum;
if not missing(next_s_fil) then s_fil=next_s_fil;
output;
end;
else if last and missing(s_fil) then output;
run;

 

View solution in original post


All Replies
Regular Contributor
Posts: 161

Re: Reading raw files

Does the sequence repeat?   Is there only one "F" record in the file or for every 3 records there is a F record followed by an address record followed by a shifted "018" or "019" record ?

 

Would like to see some more data.  With limited infomation, should be doable.

Kannan Deivasigamani
Regular Contributor
Posts: 212

Re: Reading raw files

[ Edited ]

 

F   003   3  20150515 002  0001     1467190 00000000000000 20150515 123928 20150515            237.925.559,51     0762813002413
172 PAYMENT
002  0001     1467190 00000000000000 20150515 123929 20150515             38.271.480,73     0762813002414
172 PAYMENT
002  0001     1470370 00000000000000 20150515 143314 20150515                254.311,29     0762813002417
F   011   1  20150323 005  0031    31073808 00000581701860 20150323 074917 20150323 03                 411,43     0000000000000
.   .
005  0031    31080804 00000826413048 20150323 085940 20150323 03                 244,42     0000000000000
.   .

 

Actualy I need make just like an IF condition.

The point is:  "

F   003   3  20150515 002

This part the "002" is in the position 21 or 22, but the third line also should be at the same position, but it isn't,

begins at the first position. 

PROC Star
Posts: 7,363

Re: Reading raw files

Not sure if this is exactly what you want, but should be close enough that you can adjust as needed:

 

DATA TEST;
/* INFILE "/home/re43526/quickview/TI47_*.txt"*/
/* TRUNCOVER FIRSTOBS=11;*/
infile cards truncover;
input @;
if mod(_n_,3) eq 1 then
INPUT F1 $1-4
F2 $5-7
F3 $8-11
F4 $14-21
F5 $22-25
F6 $26-31
F7 $43-58
F8 $60-67
F9 $68-74
F10 $76-83
F11 $100-109
F12 $115-135
#2
F13 $1-4
F14 $4-50;
else if mod(_n_,3) eq 2 then
INPUT F5 $1-4
F6 $6-11
F7 $22-37
F8 $38-45
F9 $47-52
F10 $54-61
F11 $72-87
F12 $90-105
#2
F13 $1-4
F14 $4-50;
else
INPUT F5 $1-4
F6 $6-11
F7 $22-37
F8 $38-45
F9 $47-52
F10 $54-61
F11 $72-87
F12 $90-105
;
CARDS;
F 015 3 20151029 019 0100 00000000000000 20151029 102004 20151029 47,28 1530215000019
591 PAYMENT
019 0100 00000000000000 20151029 102155 20151029 212,50 1530215000020
. .
018 0023 00000000000000 20151029 103258 20151029 538,85 1530215000021
;
RUN;

 

Regular Contributor
Posts: 212

Re: Reading raw files

[ Edited ]

Art, thanks this kind i solved, but when use infile "with my txt file", there are some lines that i don't need to read.

Check out the attachment to see what i'm talking about.

 

Solution
‎11-03-2015 07:07 AM
PROC Star
Posts: 7,363

Re: Reading raw files

Your data file is different than your first example. There are a number of header records and some of the records contain tabs instead of spaces.

 

Regardless, I think the following either works or comes extremely close:

 

DATA need;
INFILE "/folders/myfolders/test.txt" truncover;
informat s_fil $20.;
input @;
if strip(_infile_) eq '' then do;
call missing(s_fil);
input;
end;
else if substr(_infile_,1,3) in ('S.A','ENT','FLL','---','STA') or
substr(_infile_,24,4) eq 'SERV' then do;
call missing(s_fil);
input;
end;
else if anyalpha(substr(_infile_,26,5)) then do;
input s_fil &;
output;
end;
else if substr(_infile_,1,1) eq '09'x then do;
_infile_=compress(_infile_,'09'x);
_infile_=cat(' ',strip(_infile_));
INPUT agda_num $24-26
serv_msg_num $29-32
con $37-44
ret $46-59
dt2 $61-68
txn $70-75
hr_txn_dt $77-84
pgto_pc $86-87
valor $100-110
age_tax_enrlmt_nbr $116-128
;
output;
end;

else do;
INPUT sta $1-4
cal $6-8
sg $12-12
dt1 $15-22
agda_num $24-26
serv_msg_num $29-32
con $37-44
ret $46-59
dt2 $61-68
txn $70-75
hr_txn_dt $77-84
pgto_pc $86-87
valor $100-110
age_tax_enrlmt_nbr $116-128
;
output;
end;
run;
data want (drop=nextSmiley Happy;
recnum=_n_+1;
set need end=last;
if not last and missing(s_fil) then do;
set need (keep=s_fil rename=(s_fil=next_s_fil)) point=recnum;
if not missing(next_s_fil) then s_fil=next_s_fil;
output;
end;
else if last and missing(s_fil) then output;
run;

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 415 views
  • 0 likes
  • 3 in conversation