deleted_user
Not applicable
I have a binary file that was originally created on a RISC/AIX platform. The file is being stuffed (ftp'd) into a mainframe dataset that is fixed block (FB) with LRECL=80 and BLKSIZE=6160. The file contains a mix of character data, integers, and floating-point numbers. I have no control over the ftp or the mainframe dataset. I need to read and process this file on the mainframe and am looking for advice on how to do this.

I am new to SAS on the mainframe, but I do have experience with SAS in a Sun/Solaris environment, reading binary files using recfm=n with the various "ib" informats. I've tried recfm=n on the mainframe, but I get an error message something like "byte positioning is not allowed...". Is it possible to read binary files on the mainframe? What recfm and infile options should I be setting?
sbb
Lapis Lazuli | Level 10
Are you expecting to do byte-level processing or record-level processing? And will the data be handled as ASCII or EBCDIC? One option is to read the data byte by byte using the $CHAR informat, but as far as I can tell, the purpose of the processing is unclear from your post.

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
Here's some more info for my problem. Sorry it's a bit long, but I wanted to toss it all out there.

The binary file contains several different data structures (records) embedded within it, and some of those data structures repeat (nested records, with the repeat counters built into the previous record). Each of those record types has a different length (number of bytes). Ultimately, I want to read each of those record types and do some record-level processing on them.

I was able to put together a couple of test programs to read this file using various recfm, lrecl, and blocksize options, and I do believe I have the correct informats (I'm using $ascii, ib4., ieee8., and some others). Some of the test programs worked (sort of, see the following paragraphs) and some did not. I was expecting to use recfm=n and be on my way, but I got that error message about "byte positioning" that I don't understand. According to the SAS mainframe docs, it looks like recfm=n is allowed?

So, I tried recfm=F and recfm=U with some success, but I cannot process the entire file in either case. After the first few thousand bytes or so, the fields that I had been reading successfully become completely garbled. Eventually, I get error messages like "undetermined I/O failure". I'm guessing that is because the counters that get read become filled with bad data and the loops that I have reading those data structures get messed up.

I can read the entire file byte-by-byte with either recfm=F or U and any lrecl and blocksize, but I don't think that does me any good as I need each of those records as a whole and the fields within them to process.

I believe my problem has to do with the amount of data that SAS is sucking into its internal input buffer when an "input" statement is executed, and with the fact that the binary file is being stuffed into that fixed block mainframe dataset with specific lrecl and blocksize settings. Based on some tests that I've run with various lrecl and blocksize settings within SAS, I always seem to run into trouble with the garbled data when the input buffer reaches the end and there are only a few more bytes to process. I have seen my input statements work successfully up to that point. Then, when SAS tries to read a field that requires more bytes than are left in the input buffer (e.g. reading rb8. when there are only 2 bytes left in the buffer), it essentially throws away those remaining bytes and refills its internal buffer with the next block of bytes for that field and the following ones. That is the point at which all the data becomes garbled: the byte count is now off and I'm reading the wrong bytes for each field.
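To make the suspected failure mode concrete, here is a minimal Python sketch (not SAS, and not a claim about what SAS does internally) that models a reader which discards leftover bytes whenever a field would cross a block boundary. The 6160 block size is from the post; the 6-byte header, the field count, and the function name `read_discarding_tail` are assumptions made up for illustration.

```python
import struct

BLOCK = 6160     # block size of the mainframe dataset (from the post)
PREFIX = 6       # hypothetical 6-byte header so 8-byte fields are not block-aligned
N_FIELDS = 1000  # hypothetical number of fields

# Hypothetical test data: header, then big-endian doubles 0.0, 1.0, 2.0, ...
data = bytes(PREFIX) + b"".join(struct.pack(">d", float(i)) for i in range(N_FIELDS))

def read_discarding_tail(data, start, block=BLOCK, width=8):
    """Read doubles, but drop leftover bytes whenever a field would cross a block edge."""
    pos, values = start, []
    while pos + width <= len(data):
        room = block - (pos % block)   # bytes left in the current block
        if room < width:
            pos += room                # the suspected behaviour: discard the tail
            continue
        values.append(struct.unpack_from(">d", data, pos)[0])
        pos += width
    return values

values = read_discarding_tail(data, PREFIX)
first_bad = next(i for i, v in enumerate(values) if v != float(i))
print("first garbled field:", first_bad, "at byte offset", PREFIX + 8 * first_bad)
```

With these assumed numbers, the first 769 fields decode correctly, and the field that starts at offset 6158 (two bytes short of the 6160 boundary) is the first to go wrong. Every field after it is read from shifted bytes, which matches the symptom described above.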

In my specific case here, the magic number is 6160 (the blocksize of the dataset on the mainframe). Once I reach that many bytes, things go wrong. To see if I can get past some of my problems and understand this better, I have experimented with preallocating datasets on the mainframe with much larger lrecls and blocksizes and pushing binary files into them, but I always run into trouble when SAS reaches "blocksize" number of bytes. If the binary file happens to be small enough to fit into one block on the mainframe, everything is good, but I cannot assume that will be the case. I believe the largest lrecl and blocksize that can be preallocated on the mainframe is ~32K and these files that I'm processing will exceed that.

I believe if I can read these files as a stream of bytes with no magic "boundaries", I will be in good shape. I just don't understand why recfm=n doesn't do the trick. That seems like the obvious solution and one that I'm very familiar with.
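As a language-neutral illustration of "a stream of bytes with no boundaries", the Python sketch below carries leftover bytes from one fixed-size chunk into the next, so a field may straddle a chunk edge. It is only a model of the idea, not SAS code. The `ByteStream` class, the `read_record` helper, and the record layout (a 4-byte big-endian count followed by that many 8-byte IEEE doubles) are all hypothetical, and padding of a final short record is ignored.

```python
import struct

class ByteStream:
    """Present a sequence of fixed-size chunks as one continuous byte stream."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buf = b""

    def read(self, n):
        # Pull more chunks until n bytes are buffered; bytes left over from the
        # previous chunk stay at the front instead of being thrown away.
        while len(self._buf) < n:
            chunk = next(self._chunks, None)
            if chunk is None:
                raise EOFError("stream ended in the middle of a field")
            self._buf += chunk
        out, self._buf = self._buf[:n], self._buf[n:]
        return out

def read_record(stream):
    """Hypothetical layout: 4-byte big-endian count, then that many 8-byte doubles."""
    (count,) = struct.unpack(">i", stream.read(4))
    return [struct.unpack(">d", stream.read(8))[0] for _ in range(count)]

# Hypothetical data: two variable-length records, cut into 80-byte chunks
# the way LRECL=80 would cut them.
payload = b"".join(
    struct.pack(">i", len(vals)) + b"".join(struct.pack(">d", v) for v in vals)
    for vals in ([1.5, 2.5, 3.5] * 5, [float(i) for i in range(20)])
)
chunks = [payload[i:i + 80] for i in range(0, len(payload), 80)]

stream = ByteStream(chunks)
records = [read_record(stream), read_record(stream)]
print([len(r) for r in records])
```

Here several fields cross an 80-byte chunk edge, yet both records come back intact because no bytes are dropped at the boundary. The counter in each record drives how many fields follow, as in the nested structures described earlier.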

And just to reiterate... I am fairly new to the mainframe and to SAS on the mainframe, so I'm hoping that I'm missing something very obvious here. In my original problem (not in my tests and experiments above), I do not have control over the mainframe dataset or the ftp of the file onto the mainframe. I have to assume that the file will be in a fixed block format with a specific lrecl and blocksize (currently 80 and 6160, respectively).

I hope you followed all that. Thanks for listening and thanks in advance for any/all help.
sbb
Lapis Lazuli | Level 10
You should consider reloading your data and specifying RECFM=U. Also, share your code for some more insightful feedback. Using IDCAMS PRINT with a hex dump may also give you some indication of what SAS is considering to be a physical record, and the SAS LIST; statement within a DATA step might also help.

Scott Barry
SBBWorks, Inc.
Cynthia_sas
SAS Super FREQ
Hi:
I'm a mainframe dinosaur -- so take this part of my comment with a tsp of salt -- I worked with SAS on the mainframe in Version 5 and Version 6 in OS/VS1, OS/360, OS/390 and, if you count CMS/VM as a mainframe, on CMS/VM. In my experience (from that long ago), I -had to- use the mainframe RECFM and DCB characteristics in my SAS code -- in other words, if the RECFM was FB on the mainframe file, then I could not (or was told not to) try to read the file as though it had another RECFM. But those were the old days and perhaps z/OS is different. The SAS doc on reading files on z/OS is here:
http://support.sas.com/documentation/cdl/en/hosto390/59577/HTML/default/chifoptfmain.htm and you should also look for the SAS Companion documentation for your flavor of mainframe install.

RECFM=N is supposed to work to read binary data, but in SAS 8, there were issues with certain informats if you used RECFM=N per this note:
http://support.sas.com/kb/4/163.html

But I really think that none of the above matters much, except as commentary. I recommend that you contact SAS Tech Support, as they could take a closer look at your data and your program and help you come to some resolution.

To open a track with Tech Support, fill out the form at this link:
http://support.sas.com/ctx/supportform/createForm

cynthia
