Hello,
I am attempting to extract packed decimal data from an IBM mainframe. The data as we received is not valid packed decimal but rather appears to have been translated from EBCDIC to ASCII. Is there a built-in format or function in SAS that will translate ASCII data to EBCDIC? If so, is it possible to specify the which translation table should be used? By converting the ASCII codes to EBCDIC in the ISO_1 ASCII-to-EBCDIC translation table on this page I am able to parse out the correct packed decimal data.
For example, a test case I have has a true value of 84.47 (or 08447C as packed decimal). The code below creates a test file containing the true value and the data as we receive it (hex value 97 E0 40). Following the table linked above, those coordinates correspond to the true values 08 44 7C. I am looking for a method that will convert the observed hex values to the correct decimal values. I've gotten close using this combination of hex, ebcdic formats and the byte function, but upon running this converts the "97" hex value to "00" when I am looking for "08". I am looking to see if anyone knows of a simpler and correct way to do this.
filename pd 'C:\pdexample.txt'; data _null_; file 'C:\pdexample.txt'; true=84.47; observed = 99533.44; put true S370FPD6.2; put observed S370FIB6.2; run; data testcase; infile 'C:\pdexample.txt'; input @1 val_Raw $6. @1 val_FIB S370FIB6.0 ; run; data testcase2; set testcase; val_Hex = put(val_FIB,$hex.); array byte_ {4} $ B1 B2 B3 B4; array pdconv {4} $ PD1 PD2 PD3 PD4; do i=1 to dim(byte_); byte_{i} = substr(val_hex,i*2-1,2); pdconv{i} = put(put(byte(input(byte_{i},hex.)),$ebcdic.),$hex.); end; val_pd = cats(PD1,PD2,PD3,PD4); if substr(val_pd,length(val_pd),1)="C" then val = substr(val_pd,1,length(val_pd)-1)/100; if substr(val_pd,length(val_pd),1)="D" then val = substr(val_pd,1,length(val_pd)-1)/-100; run;
I hope I've made the issue evident but I am happy to clarify anything, as I am not entirely sure the best question to ask or how to ask it. Any guidance is appreciated!
I am running SAS 9.4TSM6 on Windows Server 2016.
Thank you,
Brian
You could try using the translation tables the KCVT() function uses:
data want;
set have;
WANT = kcvt(HAVE,'ascii','ebcdic');
run;
Here is what those tables look like:
data test;
do decimal=0 to 255 ;
hex=put(decimal,hex2.);
ascii = byte(decimal);
ebcdic = kcvt(ascii,'ascii','ebcdic');
ascii2 = kcvt(ebcdic,'ebcdic','ascii');
output;
end;
format ascii ebcdic ascii2 $hex2. ;
run;
data _null_;
put ' _X ASCII - EBCDIC'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test;
put ' ' ebcdic $hex2. @ ;
end;
put;
end;
stop;
run;
proc sort data=test out=test2; by ebcdic ; run;
data _null_;
put ' _X EBCDIC - ASCII'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test2;
put ' ' ascii $hex2. @ ;
end;
put;
end;
stop;
run;
_X ASCII - EBCDIC X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 37 2D 2E 2F 16 05 25 0B 0C 0D 0E 0F 1 10 11 12 13 3C 3D 32 26 18 19 3F 27 1C 1D 1E 1F 2 40 5A 7F 7B 5B 6C 50 7D 4D 5D 5C 4E 6B 60 4B 61 3 F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 7A 5E 4C 7E 6E 6F 4 7C C1 C2 C3 C4 C5 C6 C7 C8 C9 D1 D2 D3 D4 D5 D6 5 D7 D8 D9 E2 E3 E4 E5 E6 E7 E8 E9 AD E0 BD 5F 6D 6 79 81 82 83 84 85 86 87 88 89 91 92 93 94 95 96 7 97 98 99 A2 A3 A4 A5 A6 A7 A8 A9 C0 4F D0 A1 07 8 20 21 22 23 24 15 06 17 28 29 2A 2B 2C 09 0A 1B 9 30 31 1A 33 34 35 36 08 38 39 3A 3B 04 14 3E FF A 41 42 43 44 45 46 47 48 49 4A 51 52 53 54 55 56 B 57 58 59 62 63 64 65 66 67 68 69 6A 70 71 72 73 C 74 75 76 77 78 80 8A 8B 8C 8D 8E 8F 90 9A 9B 9C D 9D 9E 9F A0 AA AB AC AE AF B0 B1 B2 B3 B4 B5 B6 E B7 B8 B9 BA BB BC BE BF CA CB CC CD CE CF DA DB F DC DD DE DF E1 EA EB EC ED EE EF FA FB FC FD FE _X EBCDIC - ASCII X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 9C 09 86 7F 97 8D 8E 0B 0C 0D 0E 0F 1 10 11 12 13 9D 85 08 87 18 19 92 8F 1C 1D 1E 1F 2 80 81 82 83 84 0A 17 1B 88 89 8A 8B 8C 05 06 07 3 90 91 16 93 94 95 96 04 98 99 9A 9B 14 15 9E 1A 4 20 A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 2E 3C 28 2B 7C 5 26 AA AB AC AD AE AF B0 B1 B2 21 24 2A 29 3B 5E 6 2D 2F B3 B4 B5 B6 B7 B8 B9 BA BB 2C 25 5F 3E 3F 7 BC BD BE BF C0 C1 C2 C3 C4 60 3A 23 40 27 3D 22 8 C5 61 62 63 64 65 66 67 68 69 C6 C7 C8 C9 CA CB 9 CC 6A 6B 6C 6D 6E 6F 70 71 72 CD CE CF D0 D1 D2 A D3 7E 73 74 75 76 77 78 79 7A D4 D5 D6 5B D7 D8 B D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 5D E6 E7 C 7B 41 42 43 44 45 46 47 48 49 E8 E9 EA EB EC ED D 7D 4A 4B 4C 4D 4E 4F 50 51 52 EE EF F0 F1 F2 F3 E 5C F4 53 54 55 56 57 58 59 5A F5 F6 F7 F8 F9 FA F 30 31 32 33 34 35 36 37 38 39 FB FC FD FE FF 9F
Typically IBM mainframe data copied to an ASCII computer is best transferred in binary format. Then you use the S370 SAS informats to translate the data from native EBCDIC into ASCII form. What is the problem with getting another binary transfer? Any other approach would be a pain in the proverbial as I think you are discovering.
Are you saying someone/something tried to translate the binary bytes like '44'x as if it was an EBCDIC code?
That code has no meaning in EBCDIC.
Thanks for the response. I'm not entirely sure what translation took place, but I do know that when I view the source data of purported packed decimal, I see invalid PD data (e.g. the "44" is coming in to our system as "E0"). As far as I can tell, packed decimal data is getting converted to ASCII following the "ISO-1" convention of EBCDIC to ASCII:
If I can back-out the translation using the above table (or rather, ASCII to EBCDIC) then I can reconstruct the valid packed decimal data. Is there a function or format in SAS to convert ASCII to EBCDIC codes following a convention such as the one above?
AFAIK, SAS doesn't have an easy way to undo an EBCDIC to ASCII translation. Resending the mainframe file using FTP with the BINARY option seems like a no brainer to me - SAS has functionality to do that.
Yes, correct. I've used this approach myself in the dim, distant past and it works really well.
You can connect directly to the FTP server on the mainframe from SAS, using the proper options. This means that you need to know the exact name of the file on the MF, and its organization (FB, VB, ...)
If it is FB, you read with RECFM=N and the exact LRECL as defined, if it is VB, see here.
Then you read the file using MF-informats ($EBCDIC, S370FPD and its brethren).
If you can convert that picture into data then you could just use the TRANSLATE() function.
data want;
set have;
ascii = translate(ebcdic,collate(0,255),'000102039c.....'x);
run;
But that assumes you know what mapping was used and that it was 1-1 (or at least 1-1 for any possible byte that might appear in the PD field).
Thank you for suggesting this approach. Assuming this is the correct translation (and if it's 1-1) I tried implementing on a test case that only used one row of the data table. Here I'm trying to import the decimal value "8". (I get the "observed" value of 1.51 by converting "97" to hex=151, and 97 in turn is derived from that table linked above with coordinates "08". A hex view of the data confirms.)
filename pd 'C:\pdexample2.txt';
data _null_;
file 'C:\pdexample2.txt';
true=8;
observed = 1.51;
put true S370FPD6.2;
put observed S370FIB6.2;
run;
data testcase;
infile 'C:\pdexample2.txt';
input
@1 val_Raw $6.
@1 val_FIB S370FIB6.0
;
run;
data want;
set testcase;
ascii = translate(val_Raw,collate(0,15),'000102039C09867F978D8E0B0C0D0E0F'x);
run;
When I run this code, I am not sure how to use the result (I am mainly looking at row 2, since that is the data how we observe it):
Can you clarify what I would do with the result? Does this need to be translated again using another function or format?
Once you get the text string converted back to the original binary text you still need to use one of the S3270 informats to convert the string into the number it represents.
Note that for the S370FPD informat the width is the number of bytes in the string to read and the decimal is where to place the implied decimal point. So it is possible to have the decimal value be larger than the width value, like S370FPD6.7, since the number of decimal digits is two times the width.
data test;
length badtext goodtext $3;
format badtext goodtext $hex6.;
badtext='97E040'x;
goodtext = translate(badtext,'08447C'x, '97E040'x);
want=84.47;
number = input(goodtext,S370FPD3.2);
put (_all_) (=/);
run;
badtext=97E040 goodtext=08447C want=84.47 number=84.47
You could try using the translation tables the KCVT() function uses:
data want;
set have;
WANT = kcvt(HAVE,'ascii','ebcdic');
run;
Here is what those tables look like:
data test;
do decimal=0 to 255 ;
hex=put(decimal,hex2.);
ascii = byte(decimal);
ebcdic = kcvt(ascii,'ascii','ebcdic');
ascii2 = kcvt(ebcdic,'ebcdic','ascii');
output;
end;
format ascii ebcdic ascii2 $hex2. ;
run;
data _null_;
put ' _X ASCII - EBCDIC'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test;
put ' ' ebcdic $hex2. @ ;
end;
put;
end;
stop;
run;
proc sort data=test out=test2; by ebcdic ; run;
data _null_;
put ' _X EBCDIC - ASCII'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test2;
put ' ' ascii $hex2. @ ;
end;
put;
end;
stop;
run;
_X ASCII - EBCDIC X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 37 2D 2E 2F 16 05 25 0B 0C 0D 0E 0F 1 10 11 12 13 3C 3D 32 26 18 19 3F 27 1C 1D 1E 1F 2 40 5A 7F 7B 5B 6C 50 7D 4D 5D 5C 4E 6B 60 4B 61 3 F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 7A 5E 4C 7E 6E 6F 4 7C C1 C2 C3 C4 C5 C6 C7 C8 C9 D1 D2 D3 D4 D5 D6 5 D7 D8 D9 E2 E3 E4 E5 E6 E7 E8 E9 AD E0 BD 5F 6D 6 79 81 82 83 84 85 86 87 88 89 91 92 93 94 95 96 7 97 98 99 A2 A3 A4 A5 A6 A7 A8 A9 C0 4F D0 A1 07 8 20 21 22 23 24 15 06 17 28 29 2A 2B 2C 09 0A 1B 9 30 31 1A 33 34 35 36 08 38 39 3A 3B 04 14 3E FF A 41 42 43 44 45 46 47 48 49 4A 51 52 53 54 55 56 B 57 58 59 62 63 64 65 66 67 68 69 6A 70 71 72 73 C 74 75 76 77 78 80 8A 8B 8C 8D 8E 8F 90 9A 9B 9C D 9D 9E 9F A0 AA AB AC AE AF B0 B1 B2 B3 B4 B5 B6 E B7 B8 B9 BA BB BC BE BF CA CB CC CD CE CF DA DB F DC DD DE DF E1 EA EB EC ED EE EF FA FB FC FD FE _X EBCDIC - ASCII X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 9C 09 86 7F 97 8D 8E 0B 0C 0D 0E 0F 1 10 11 12 13 9D 85 08 87 18 19 92 8F 1C 1D 1E 1F 2 80 81 82 83 84 0A 17 1B 88 89 8A 8B 8C 05 06 07 3 90 91 16 93 94 95 96 04 98 99 9A 9B 14 15 9E 1A 4 20 A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 2E 3C 28 2B 7C 5 26 AA AB AC AD AE AF B0 B1 B2 21 24 2A 29 3B 5E 6 2D 2F B3 B4 B5 B6 B7 B8 B9 BA BB 2C 25 5F 3E 3F 7 BC BD BE BF C0 C1 C2 C3 C4 60 3A 23 40 27 3D 22 8 C5 61 62 63 64 65 66 67 68 69 C6 C7 C8 C9 CA CB 9 CC 6A 6B 6C 6D 6E 6F 70 71 72 CD CE CF D0 D1 D2 A D3 7E 73 74 75 76 77 78 79 7A D4 D5 D6 5B D7 D8 B D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 5D E6 E7 C 7B 41 42 43 44 45 46 47 48 49 E8 E9 EA EB EC ED D 7D 4A 4B 4C 4D 4E 4F 50 51 52 EE EF F0 F1 F2 F3 E 5C F4 53 54 55 56 57 58 59 5A F5 F6 F7 F8 F9 FA F 30 31 32 33 34 35 36 37 38 39 FB FC FD FE FF 9F
You'll have to translate anything above 7E manually. There is no ASCII value for 97: it depends on the code page you are using.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.