I am trying to read the data from .txt external file through SAS Code,it has '|' as a column delimiter and '<' as a row delimiter.I am not able to read the data in proper way,I am using below mentioned code to read the data from the server.Data is coming in as a data stream without any end of line indicators like CRLF.I tried all the options FLOWOVER,MISSOVER and TRUCOVER but i am not getting the desired result.Can someone let me know how to read the row delimiter here ? Thanks for your help.
DATA WORK.PURCHASE;
LENGTH
Filer_TIN $ 11
Ret_Type $ 1
Period_Key 8
TAXABLE_GOODS_VAL 8
AMT_TAX_VAL 8
INCL_TAX_VAL 8
COMP_VAL 8
TAX_FRE_SALES_VAL 8
EXEM_SALES_VAL 8
LABOUR_CHRGES_VAL 8
OTHER_CHRGES_VAL 8
GROSS_TOTAL_VAL 8
FORM_NUM 8
TYPE_OF_TX 8
Partner_TIN $ 11;
FORMAT
Filer_TIN $CHAR11.
Ret_Type $CHAR1.
Period_Key BEST4.
TAXABLE_GOODS_VAL BEST11.
AMT_TAX_VAL BEST9.
INCL_TAX_VAL BEST10.
COMP_VAL BEST9.
TAX_FRE_SALES_VAL BEST10.
EXEM_SALES_VAL BEST8.
LABOUR_CHRGES_VAL BEST10.
OTHER_CHRGES_VAL BEST9.
GROSS_TOTAL_VAL BEST11.
FORM_NUM BEST3.
TYPE_OF_TX BEST2.
Partner_TIN $CHAR11.;
INFORMAT
Filer_TIN $CHAR11.
Ret_Type $CHAR1.
Period_Key BEST4.
TAXABLE_GOODS_VAL BEST11.
AMT_TAX_VAL BEST9.
INCL_TAX_VAL BEST10.
COMP_VAL BEST9.
TAX_FRE_SALES_VAL BEST10.
EXEM_SALES_VAL BEST8.
LABOUR_CHRGES_VAL BEST10.
OTHER_CHRGES_VAL BEST9.
GROSS_TOTAL_VAL BEST11.
FORM_NUM BEST3.
TYPE_OF_TX BEST2.
Partner_TIN $CHAR11.;
INFILE '/sas/EIUData/PURCHASE_ANNEX.txt'
LRECL=102
ENCODING="UTF-8"
DLMSTR="|" truncover;
INPUT
Filer_TIN : $CHAR11.
Ret_Type : $CHAR1.
Period_Key : ?? BEST4.
TAXABLE_GOODS_VAL : ?? COMMA11.
AMT_TAX_VAL : ?? COMMA9.
INCL_TAX_VAL : ?? COMMA10.
COMP_VAL : ?? COMMA9.
TAX_FRE_SALES_VAL : ?? COMMA10.
EXEM_SALES_VAL : ?? COMMA8.
LABOUR_CHRGES_VAL : ?? COMMA10.
OTHER_CHRGES_VAL : ?? COMMA9.
GROSS_TOTAL_VAL : ?? COMMA11.
FORM_NUM : ?? BEST3.
TYPE_OF_TX : ?? BEST2.
Partner_TIN : $CHAR11. ;
RUN;
Sample Data :
27020018076 |R|1606|5045.48|277.48|0.00|0.00|0.00|0.00|0.00|0.00|5322.96|231 |10 |27080058880 |<27020018076 |R|1606|947.87|52.13|0.00|0.00|0.00|0.00|
0.00|0.00|1000.00|231 |10 |27131130266 |<27020018076 |R|1606|1848.34|101.66|0.00|0.00|0.00|0.00|0.00|0.00|1950.00|231 |10 |27220036105
|<27020018076 |R|1606|8597.17|472.85|0.00|0.00|0.00|0.00|0.00|0.00|9070.02|231 |10 |27220646138 |<27020018076 |R|1606|22800.00|1360.00|0.00|
0.00|0.00|0.00|0.00|0.00|24160.00|231 |10 |27240041138 |<27020018076 |R|1606|67096.22|4025.02|0.00|0.00|0.00|0.00|0.00|0.00|71121.24|231 |10
|27260645728 |<27020018076 |R|1606|6493.12|357.12|0.00|0.00|0.00|0.00|0.00|0.00|6850.24|231 |10 |27270301033 |<27020018076 |R|1606|945001.00|
52605.00|0.00|0.00|0.00|0.00|0.00|0.00|997606.00|231 |10 |27290000710 |<27020018076 |R|1606|3981.04|218.96|0.00|0.00|0.00|0.00|0.00|0.00|4200.00|
231 |10 |27290279682 |<27020018076 |R|1606|2800.00|168.00|0.00|0.00|0.00|0.00|0.00|0.00|2968.00|231 |10 |27361171219 |<27020018076 |R|1606|
59052.49|3247.87|0.00|0.00|0.00|0.00|0.00|0.00|62300.36|231 |10 |27390019279 |<27020018076 |R|1606|8583163.22|514989.78|0.00|0.00|0.00|0.00|
0.00|0.00|9098153.00|231 |10 |27460293314 |<27020018076 |R|1606|440662.27|26439.73|0.00|0.00|0.00|0.00|0.00|0.00|467102.00|231 |10 |
27490371140 |<27020018076 |R|1606|4146.96|228.08|0.00|0.00|0.00|0.00|0.00|0.00|4375.04|231 |10 |27530294100 |<27020018076 |R|1606|54424.24|
7347.24|0.00|0.00|0.00|0.00|0.00|0.00|61771.48|231 |10 |27610060189 |<27020018076 |R|1606|3500.50|192.50|0.00|0.00|0.00|0.00|0.00|0.00|3693.00|
231 |10 |27650269493 |<27020018076 |R|1606|471985.10|28319.10|0.00|0.00|0.00|0.00|0.00|0.00|500304.20|231 |10 |27670990871 |<27020018076 |
R|1606|2853.81|171.23|0.00|0.00|0.00|0.00|0.00|0.00|3025.04|231 |10 |27690241729 |<27020018076 |R|1606|642012.00|38970.74|0.00|0.00|0.00|0.00|
Kindly find the attached sample data for refrence.
Read it using RECFM=N or S and use both | and < as delimiters. That should work as long as all "lines" have all of the variables.
The problem is twofold:
- SAS only accepts CR, LF or CRLF as line separators in variable-length text files.
- your file already contains spurious CRLF sequences
So one would have to first remove the CRLF present, and then replace the "<" with either a single LF or CRLF.
On UNIX I would try to use the tr filter, but I have no idea if that would work.
Why not just pre-process the messed up file and make a normal file out of it?
This step seems to do the job for your example file.
data _null_;
infile tmpfile1 recfm=n dsd dlm='|<';
file tmpfile2 dsd dlm='|';
length v1-v16 $100 ;
input v1-v16 ;
array v v1-v16; do over v; v=translate(v,' ','0D0A'x); end;
put v1-v16;
run;
076|R|1606|5045.48|277.48|0.00|0.00|0.00|0.00|0.00|0.00|5322.96|231|10|27080058880| 076|R|1606|947.87|52.13|0.00|0.00|0.00|0.00|0.00|0.00|1000.00|231|10|27131130266| 076|R|1606|1848.34|101.66|0.00|0.00|0.00|0.00|0.00|0.00|1950.00|231|10|27220036105| 076|R|1606|8597.17|472.85|0.00|0.00|0.00|0.00|0.00|0.00|9070.02|231|10|27220646138| 076|R|1606|22800.00|1360.00|0.00|0.00|0.00|0.00|0.00|0.00|24160.00|231|10|27240041138| 076|R|1606|67096.22|4025.02|0.00|0.00|0.00|0.00|0.00|0.00|71121.24|231|10|27260645728| 076|R|1606|6493.12|357.12|0.00|0.00|0.00|0.00|0.00|0.00|6850.24|231|10|27270301033| 076|R|1606|945001.00|52605.00|0.00|0.00|0.00|0.00|0.00|0.00|997606.00|231|10|27290000710| 076|R|1606|3981.04|218.96|0.00|0.00|0.00|0.00|0.00|0.00|4200.00|231|10|27290279682| 076|R|1606|2800.00|168.00|0.00|0.00|0.00|0.00|0.00|0.00|2968.00|231|10|27361171219| 076|R|1606|59052.49|3247.87|0.00|0.00|0.00|0.00|0.00|0.00|62300.36|231|10|27390019279| 076|R|1606|8583163.22|514989.78|0.00|0.00|0.00|0.00|0.00|0.00|9098153.00|231|10|27460293314| 076|R|1606|440662.27|26439.73|0.00|0.00|0.00|0.00|0.00|0.00|467102.00|231|10|27490371140| 076|R|1606|4146.96|228.08|0.00|0.00|0.00|0.00|0.00|0.00|4375.04|231|10|27530294100| 076|R|1606|54424.24|7347.24|0.00|0.00|0.00|0.00|0.00|0.00|61771.48|231|10|27610060189| 076|R|1606|3500.50|192.50|0.00|0.00|0.00|0.00|0.00|0.00|3693.00|231|10|27650269493| 076|R|1606|471985.10|28319.10|0.00|0.00|0.00|0.00|0.00|0.00|500304.20|231|10|27670990871| 076|R|1606|2853.81|171.23|0.00|0.00|0.00|0.00|0.00|0.00|3025.04|231|10|27690241729|
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.