- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am attempting to extract packed decimal data from an IBM mainframe. The data as we received is not valid packed decimal but rather appears to have been translated from EBCDIC to ASCII. Is there a built-in format or function in SAS that will translate ASCII data to EBCDIC? If so, is it possible to specify the which translation table should be used? By converting the ASCII codes to EBCDIC in the ISO_1 ASCII-to-EBCDIC translation table on this page I am able to parse out the correct packed decimal data.
For example, a test case I have has a true value of 84.47 (or 08447C as packed decimal). The code below creates a test file containing the true value and the data as we receive it (hex value 97 E0 40). Following the table linked above, those coordinates correspond to the true values 08 44 7C. I am looking for a method that will convert the observed hex values to the correct decimal values. I've gotten close using this combination of hex, ebcdic formats and the byte function, but upon running this converts the "97" hex value to "00" when I am looking for "08". I am looking to see if anyone knows of a simpler and correct way to do this.
filename pd 'C:\pdexample.txt'; data _null_; file 'C:\pdexample.txt'; true=84.47; observed = 99533.44; put true S370FPD6.2; put observed S370FIB6.2; run; data testcase; infile 'C:\pdexample.txt'; input @1 val_Raw $6. @1 val_FIB S370FIB6.0 ; run; data testcase2; set testcase; val_Hex = put(val_FIB,$hex.); array byte_ {4} $ B1 B2 B3 B4; array pdconv {4} $ PD1 PD2 PD3 PD4; do i=1 to dim(byte_); byte_{i} = substr(val_hex,i*2-1,2); pdconv{i} = put(put(byte(input(byte_{i},hex.)),$ebcdic.),$hex.); end; val_pd = cats(PD1,PD2,PD3,PD4); if substr(val_pd,length(val_pd),1)="C" then val = substr(val_pd,1,length(val_pd)-1)/100; if substr(val_pd,length(val_pd),1)="D" then val = substr(val_pd,1,length(val_pd)-1)/-100; run;
I hope I've made the issue evident but I am happy to clarify anything, as I am not entirely sure the best question to ask or how to ask it. Any guidance is appreciated!
I am running SAS 9.4TSM6 on Windows Server 2016.
Thank you,
Brian
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could try using the translation tables the KCVT() function uses:
data want;
set have;
WANT = kcvt(HAVE,'ascii','ebcdic');
run;
Here is what those tables look like:
data test;
do decimal=0 to 255 ;
hex=put(decimal,hex2.);
ascii = byte(decimal);
ebcdic = kcvt(ascii,'ascii','ebcdic');
ascii2 = kcvt(ebcdic,'ebcdic','ascii');
output;
end;
format ascii ebcdic ascii2 $hex2. ;
run;
data _null_;
put ' _X ASCII - EBCDIC'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test;
put ' ' ebcdic $hex2. @ ;
end;
put;
end;
stop;
run;
proc sort data=test out=test2; by ebcdic ; run;
data _null_;
put ' _X EBCDIC - ASCII'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test2;
put ' ' ascii $hex2. @ ;
end;
put;
end;
stop;
run;
_X ASCII - EBCDIC X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 37 2D 2E 2F 16 05 25 0B 0C 0D 0E 0F 1 10 11 12 13 3C 3D 32 26 18 19 3F 27 1C 1D 1E 1F 2 40 5A 7F 7B 5B 6C 50 7D 4D 5D 5C 4E 6B 60 4B 61 3 F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 7A 5E 4C 7E 6E 6F 4 7C C1 C2 C3 C4 C5 C6 C7 C8 C9 D1 D2 D3 D4 D5 D6 5 D7 D8 D9 E2 E3 E4 E5 E6 E7 E8 E9 AD E0 BD 5F 6D 6 79 81 82 83 84 85 86 87 88 89 91 92 93 94 95 96 7 97 98 99 A2 A3 A4 A5 A6 A7 A8 A9 C0 4F D0 A1 07 8 20 21 22 23 24 15 06 17 28 29 2A 2B 2C 09 0A 1B 9 30 31 1A 33 34 35 36 08 38 39 3A 3B 04 14 3E FF A 41 42 43 44 45 46 47 48 49 4A 51 52 53 54 55 56 B 57 58 59 62 63 64 65 66 67 68 69 6A 70 71 72 73 C 74 75 76 77 78 80 8A 8B 8C 8D 8E 8F 90 9A 9B 9C D 9D 9E 9F A0 AA AB AC AE AF B0 B1 B2 B3 B4 B5 B6 E B7 B8 B9 BA BB BC BE BF CA CB CC CD CE CF DA DB F DC DD DE DF E1 EA EB EC ED EE EF FA FB FC FD FE _X EBCDIC - ASCII X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 9C 09 86 7F 97 8D 8E 0B 0C 0D 0E 0F 1 10 11 12 13 9D 85 08 87 18 19 92 8F 1C 1D 1E 1F 2 80 81 82 83 84 0A 17 1B 88 89 8A 8B 8C 05 06 07 3 90 91 16 93 94 95 96 04 98 99 9A 9B 14 15 9E 1A 4 20 A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 2E 3C 28 2B 7C 5 26 AA AB AC AD AE AF B0 B1 B2 21 24 2A 29 3B 5E 6 2D 2F B3 B4 B5 B6 B7 B8 B9 BA BB 2C 25 5F 3E 3F 7 BC BD BE BF C0 C1 C2 C3 C4 60 3A 23 40 27 3D 22 8 C5 61 62 63 64 65 66 67 68 69 C6 C7 C8 C9 CA CB 9 CC 6A 6B 6C 6D 6E 6F 70 71 72 CD CE CF D0 D1 D2 A D3 7E 73 74 75 76 77 78 79 7A D4 D5 D6 5B D7 D8 B D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 5D E6 E7 C 7B 41 42 43 44 45 46 47 48 49 E8 E9 EA EB EC ED D 7D 4A 4B 4C 4D 4E 4F 50 51 52 EE EF F0 F1 F2 F3 E 5C F4 53 54 55 56 57 58 59 5A F5 F6 F7 F8 F9 FA F 30 31 32 33 34 35 36 37 38 39 FB FC FD FE FF 9F
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Typically IBM mainframe data copied to an ASCII computer is best transferred in binary format. Then you use the S370 SAS informats to translate the data from native EBCDIC into ASCII form. What is the problem with getting another binary transfer? Any other approach would be a pain in the proverbial as I think you are discovering.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
We may need to reach out for another transfer if all else fails. The problem is partially time, getting the data from the client in the format we need, and size of data - we'd have probably around 200-300GB of data that would need transferring and re-extracting on a tight timeline. If we can even kludge together a not-so-pretty solution with data we have in-house that would be preferable.
It seems like the correct data are there it's just I don't know how to correctly extract it. But you're saying (and corroborating what I've read online elsewhere) that "correctly extracting" the data is basically a lost cause given the weird conversions that took place?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Are you saying someone/something tried to translate the binary bytes like '44'x as if it was an EBCDIC code?
That code has no meaning in EBCDIC.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the response. I'm not entirely sure what translation took place, but I do know that when I view the source data of purported packed decimal, I see invalid PD data (e.g. the "44" is coming in to our system as "E0"). As far as I can tell, packed decimal data is getting converted to ASCII following the "ISO-1" convention of EBCDIC to ASCII:
If I can back-out the translation using the above table (or rather, ASCII to EBCDIC) then I can reconstruct the valid packed decimal data. Is there a function or format in SAS to convert ASCII to EBCDIC codes following a convention such as the one above?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
AFAIK, SAS doesn't have an easy way to undo an EBCDIC to ASCII translation. Resending the mainframe file using FTP with the BINARY option seems like a no brainer to me - SAS has functionality to do that.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsglobal/p0v0ijxl1k6d4bn16cshtic7u4i3.ht...
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes, correct. I've used this approach myself in the dim, distant past and it works really well.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can connect directly to the FTP server on the mainframe from SAS, using the proper options. This means that you need to know the exact name of the file on the MF, and its organization (FB, VB, ...)
If it is FB, you read with RECFM=N and the exact LRECL as defined, if it is VB, see here.
Then you read the file using MF-informats ($EBCDIC, S370FPD and its brethren).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you can convert that picture into data then you could just use the TRANSLATE() function.
data want;
set have;
ascii = translate(ebcdic,collate(0,255),'000102039c.....'x);
run;
But that assumes you know what mapping was used and that it was 1-1 (or at least 1-1 for any possible byte that might appear in the PD field).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for suggesting this approach. Assuming this is the correct translation (and if it's 1-1) I tried implementing on a test case that only used one row of the data table. Here I'm trying to import the decimal value "8". (I get the "observed" value of 1.51 by converting "97" to hex=151, and 97 in turn is derived from that table linked above with coordinates "08". A hex view of the data confirms.)
filename pd 'C:\pdexample2.txt';
data _null_;
file 'C:\pdexample2.txt';
true=8;
observed = 1.51;
put true S370FPD6.2;
put observed S370FIB6.2;
run;
data testcase;
infile 'C:\pdexample2.txt';
input
@1 val_Raw $6.
@1 val_FIB S370FIB6.0
;
run;
data want;
set testcase;
ascii = translate(val_Raw,collate(0,15),'000102039C09867F978D8E0B0C0D0E0F'x);
run;
When I run this code, I am not sure how to use the result (I am mainly looking at row 2, since that is the data how we observe it):
Can you clarify what I would do with the result? Does this need to be translated again using another function or format?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Once you get the text string converted back to the original binary text you still need to use one of the S3270 informats to convert the string into the number it represents.
Note that for the S370FPD informat the width is the number of bytes in the string to read and the decimal is where to place the implied decimal point. So it is possible to have the decimal value be larger than the width value, like S370FPD6.7, since the number of decimal digits is two times the width.
data test;
length badtext goodtext $3;
format badtext goodtext $hex6.;
badtext='97E040'x;
goodtext = translate(badtext,'08447C'x, '97E040'x);
want=84.47;
number = input(goodtext,S370FPD3.2);
put (_all_) (=/);
run;
badtext=97E040 goodtext=08447C want=84.47 number=84.47
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You could try using the translation tables the KCVT() function uses:
data want;
set have;
WANT = kcvt(HAVE,'ascii','ebcdic');
run;
Here is what those tables look like:
data test;
do decimal=0 to 255 ;
hex=put(decimal,hex2.);
ascii = byte(decimal);
ebcdic = kcvt(ascii,'ascii','ebcdic');
ascii2 = kcvt(ebcdic,'ebcdic','ascii');
output;
end;
format ascii ebcdic ascii2 $hex2. ;
run;
data _null_;
put ' _X ASCII - EBCDIC'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test;
put ' ' ebcdic $hex2. @ ;
end;
put;
end;
stop;
run;
proc sort data=test out=test2; by ebcdic ; run;
data _null_;
put ' _X EBCDIC - ASCII'
/'X_ ' @
;
do low=0 to 15 ;
put ' ' low hex1. @;
end;
put;
do high=0 to 15 ;
put ' ' high hex1. ' ' @;
do low=0 to 15 ;
set test2;
put ' ' ascii $hex2. @ ;
end;
put;
end;
stop;
run;
_X ASCII - EBCDIC X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 37 2D 2E 2F 16 05 25 0B 0C 0D 0E 0F 1 10 11 12 13 3C 3D 32 26 18 19 3F 27 1C 1D 1E 1F 2 40 5A 7F 7B 5B 6C 50 7D 4D 5D 5C 4E 6B 60 4B 61 3 F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 7A 5E 4C 7E 6E 6F 4 7C C1 C2 C3 C4 C5 C6 C7 C8 C9 D1 D2 D3 D4 D5 D6 5 D7 D8 D9 E2 E3 E4 E5 E6 E7 E8 E9 AD E0 BD 5F 6D 6 79 81 82 83 84 85 86 87 88 89 91 92 93 94 95 96 7 97 98 99 A2 A3 A4 A5 A6 A7 A8 A9 C0 4F D0 A1 07 8 20 21 22 23 24 15 06 17 28 29 2A 2B 2C 09 0A 1B 9 30 31 1A 33 34 35 36 08 38 39 3A 3B 04 14 3E FF A 41 42 43 44 45 46 47 48 49 4A 51 52 53 54 55 56 B 57 58 59 62 63 64 65 66 67 68 69 6A 70 71 72 73 C 74 75 76 77 78 80 8A 8B 8C 8D 8E 8F 90 9A 9B 9C D 9D 9E 9F A0 AA AB AC AE AF B0 B1 B2 B3 B4 B5 B6 E B7 B8 B9 BA BB BC BE BF CA CB CC CD CE CF DA DB F DC DD DE DF E1 EA EB EC ED EE EF FA FB FC FD FE _X EBCDIC - ASCII X_ 0 1 2 3 4 5 6 7 8 9 A B C D E F 0 00 01 02 03 9C 09 86 7F 97 8D 8E 0B 0C 0D 0E 0F 1 10 11 12 13 9D 85 08 87 18 19 92 8F 1C 1D 1E 1F 2 80 81 82 83 84 0A 17 1B 88 89 8A 8B 8C 05 06 07 3 90 91 16 93 94 95 96 04 98 99 9A 9B 14 15 9E 1A 4 20 A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 2E 3C 28 2B 7C 5 26 AA AB AC AD AE AF B0 B1 B2 21 24 2A 29 3B 5E 6 2D 2F B3 B4 B5 B6 B7 B8 B9 BA BB 2C 25 5F 3E 3F 7 BC BD BE BF C0 C1 C2 C3 C4 60 3A 23 40 27 3D 22 8 C5 61 62 63 64 65 66 67 68 69 C6 C7 C8 C9 CA CB 9 CC 6A 6B 6C 6D 6E 6F 70 71 72 CD CE CF D0 D1 D2 A D3 7E 73 74 75 76 77 78 79 7A D4 D5 D6 5B D7 D8 B D9 DA DB DC DD DE DF E0 E1 E2 E3 E4 E5 5D E6 E7 C 7B 41 42 43 44 45 46 47 48 49 E8 E9 EA EB EC ED D 7D 4A 4B 4C 4D 4E 4F 50 51 52 EE EF F0 F1 F2 F3 E 5C F4 53 54 55 56 57 58 59 5A F5 F6 F7 F8 F9 FA F 30 31 32 33 34 35 36 37 38 39 FB FC FD FE FF 9F
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You'll have to translate anything above 7E manually. There is no ASCII value for 97: it depends on the code page you are using.