HoustonGSC
Calcite | Level 5
Does SAS have the capability to look at a z/OS print dataset and address an entire page rather than individual lines?
I need to be able to extract certain pages if they match certain criteria.

The selection criteria are located about 3-4 lines into each page, so I have to build an entire page first to know whether I want to keep that page or not.

Any ideas would be greatly appreciated.
23 REPLIES
sbb
Lapis Lazuli | Level 10
The z/OS (IBM mainframe) SYSOUT-directed report or output may have a non-blank character in column 1.

The straight answer to your question is no -- there is no magic here.

However, you may consider reading a SYSOUT-type output with a DATA step, incrementing a counter variable each time you reach a new page, to create a temporary SAS data set (or temporary sequential file). Then, using your own filtering/subsetting criteria, identify which page(s) need to be saved, copied, printed, or whatever, in your SAS program.
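
A minimal sketch of that page-counter idea, assuming an ANSI carriage-control character in column 1 where '1' starts a new page. The in-stream sample lines, the data set names LINES and KEEP_PAGES, and the search string 'BOOK A' are illustrative assumptions. A real job would point INFILE at the captured report and would likely write with FILE/PUT rather than PUTLOG.

```sas
/* Step 1: tag every line with the page it belongs to */
data lines;
  infile datalines truncover;
  input @1 cc $char1. @2 text $char79.;
  if cc = '1' then page + 1;          /* ANSI '1' = skip to a new page */
datalines;
1REPORT PAGE ONE
 PGM: GSGB020 BOOK A
 detail line
1REPORT PAGE TWO
 PGM: GSGB020 BOOK B
 detail line
;
run;

/* Step 2: list the pages that contain the selection string */
proc sql;
  create table keep_pages as
    select distinct page
    from lines
    where index(text, 'BOOK A') > 0
    order by page;
quit;

/* Step 3: copy only the lines of the wanted pages (here, to the log) */
data _null_;
  merge lines keep_pages(in=wanted);
  by page;
  if wanted;
  putlog cc $char1. text $char79.;
run;
```

Because the whole file is tagged first, the selection string can sit anywhere on the page; the cost is a second pass over the data.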

Also, I would suggest that this type of external data (report) processing is not necessarily unique to any given OS platform.

Scott Barry
SBBWorks, Inc.

Google advanced search argument, this topic/post:

data step programming reading external data site:sas.com
HoustonGSC
Calcite | Level 5
That was along the lines I was thinking as well. I can build a page by writing all the line records to a temporary SAS dataset and then, during new-page logic, either dump the temporary dataset to a print file if the page contains the desired criteria or clear/delete the temporary dataset if it does not.
I do not work with SAS on a daily basis and it has been probably over a year since I've had to code a new program.
Would I use PROC DATASETS to copy the temp dataset to the print file and then delete it? Or would PROC PRINTTO work? Not sure what the best method would be to dump the contents of the temp dataset and then clear it.
sbb
Lapis Lazuli | Level 10
No PROC supports writing/reading this type of external SYSOUT-related file. You will need to use a SAS DATA step with INFILE pointing to an external file where you have captured the SYSOUT report (to a dataset, not a JES-managed SYSOUT= location). For this exercise, all processing will need to be hand-coded.

Scott Barry
SBBWorks, Inc.
HoustonGSC
Calcite | Level 5
Understood, the sysout has already been captured to a file on the MF and I have it defined as an INFILE to a data step. I have everything coded but just needed some help on how to process the temporary datasets.
I will figure it out as I always do.
Cynthia_sas
SAS Super FREQ
Hi:
Long ago, in a galaxy far, far away, when I worked in a MF shop, the only historical record for some reports was the MF SYSOUT file -- with carriage controls and all. We used to routinely "scrape" the MF SYSOUT file with DATA step programs (INFILE/INPUT) for report information so we could compare this year's info to 12 years ago.

One good trick is to use $VARYING and to read each line as a huge character string. If you use the LENGTH= option on the INFILE statement, you can detect "empty" lines, because the LENGTH variable will be 0 for those lines. Then you can parse the line looking for your text of interest. Or, you might know that a column header is always spelled: YRLY AMT -- so when you find that, you know that the line immediately -after- that line is where your data begins. The single trailing @ is a big help here, because you can hold a line while you figure out whether you want it. Then, because MF reports generally have spaces between pieces of information or (in the old days) characters like '|' between columns, you can re-read the line with list input to parse out what you want.
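
A minimal sketch of the $VARYING / LENGTH= / trailing @ combination described above. The fileref RPT, the sample lines, and the variable names REGION and AMOUNT are made up for illustration; only the YRLY AMT header text comes from the description. The temporary file is written first just so the example stands on its own, and on z/OS the FILENAME statement may need different options.

```sas
filename rpt temp;                 /* hypothetical stand-in for the captured report */
data _null_;
  file rpt;
  put 'REGION  YRLY AMT';
  put 'EAST    1200';
  put;                             /* an empty line */
  put 'WEST    3400';
run;

data amounts(keep=region amount);
  infile rpt length=reclen truncover;
  length line $132 region $8;
  retain in_data 0;
  input line $varying132. reclen @;     /* read the whole record and hold it */
  if reclen = 0 then delete;            /* empty line: nothing to parse */
  if index(line, 'YRLY AMT') then do;   /* header found: data starts on the next line */
    in_data = 1;
    delete;
  end;
  if in_data then do;
    input @1 region $ amount;           /* re-read the held record with list input */
    output;
  end;
run;
```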

There may not be any programs hanging out anymore that explain report scraping, but people still do parse the SAS log for error messages, so you might want to search through SAS user group papers for how to read a "captured" SAS log file with a program.

cynthia
HoustonGSC
Calcite | Level 5
Hi Cynthia and thanks for the feedback. I actually have some programs that use the $varying option but most of these are looking for something within 1 or 2 records. The task I am working with is unusual because it requires me to build the entire page and then decide whether to keep it or reject it.
I have searched the archives a bit but it was hard to find the right expression to search on. I will take your suggestion and do some more searching to see if I can find something useful.

Thanks again,
Gil.
Cynthia_sas
SAS Super FREQ
Hi:
I'd be tempted to build one observation per page, using an array so each line on the page would be one huge character string...something like this:
[pre]
array ln $200 ln1-ln60;
[/pre]

Then you could use a DO loop to zip through each page, because each page is an obs and then you could set a yes/no flag variable about whether you want this page or not. Depending on the MF report, your SYSOUT file is usually 132 or 133 characters "wide" (linesize) by 55-60 lines "long" (Pagesize) -- so unless your report is thousands of pages, the time to DO loop over every obs won't be too bad. Then if your DO loop logic reveals whether to keep the page or not, you can either "unarray" in a separate Data step or in the same Data step to create the final report.
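
A minimal sketch of the one-observation-per-page layout, assuming an ANSI '1' in column 1 marks the top of each page. The fileref RPT, the data set name PAGES, the variables PAGE, NLINES and WANTED, and the sample lines are all illustrative assumptions; the kept pages go to the log here only to keep the example short.

```sas
filename rpt temp;                 /* hypothetical stand-in for the captured report */
data _null_;
  file rpt;
  put '1PAGE HEADER';
  put ' PGM: GSGB020 BOOK A';
  put ' detail line';
  put '1PAGE HEADER';
  put ' PGM: GSGB020 BOOK B';
run;

/* Step 1: one observation per page, each line stored whole in LN1-LN60 */
data pages(keep=page nlines ln1-ln60);
  infile rpt truncover end=done;
  array ln{60} $161 ln1-ln60;
  retain ln1-ln60;
  length rec $161;
  input rec $char161.;
  if substr(rec, 1, 1) = '1' then do;     /* a new page starts here */
    if nlines > 0 then output;            /* the previous page is complete */
    call missing(of ln{*});
    nlines = 0;
    page + 1;
  end;
  nlines + 1;
  if nlines <= dim(ln) then ln{nlines} = rec;
  if done then output;                    /* flush the last page */
run;

/* Step 2: flag each page, then "unarray" the wanted ones */
data _null_;
  set pages;
  array ln{60} ln1-ln60;
  n = min(nlines, dim(ln));
  wanted = 0;
  do i = 1 to n while (not wanted);
    wanted = (index(ln{i}, 'BOOK A') > 0);
  end;
  if wanted then do i = 1 to n;
    putlog ln{i} $char161.;
  end;
run;
```

Storing NLINES with each page means only the lines actually read are written back out, so short pages are not padded with blank lines.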

I still don't understand the "build the page" and then keep or reject it scenario??? If you are scraping a MF report, the report is already built. Do you need to rebuild it a second time???

cynthia
HoustonGSC
Calcite | Level 5
This sounds very interesting. I have taken a few vacation days off but when I return I will look at doing something like what you describe. The lrecl=161 and the report is only several hundred pages.
The reason I say I have to "build the page" is because if a page contains 60 records, the entire 60 records are "tied" together. They are either accepted or rejected as a whole.
I should know whether I want to keep the page or not by about the 4th record, however, I can't just stop my processing if the record is not a match. I have to continue to process the remaining records until a new form feed is found. At that point, I can discard all of the previous records and then start again.

I would appreciate any examples (or links) of what you described above if you have any on hand.

Thanks again,
Gil.
data_null__
Jade | Level 19
Can you provide more specific information about the contents and record format?

For example, if each page is PS records then you can easily read each line into an enumerated list of variables (array). Then figure out if you want it or not.

If each page has to be determined by ANSI carriage control then you have to check the CC column and process accordingly. You would still need an enumerated list of variables. But the way you fill it is different. Plus as you fill the variable list you can decide if the page is needed and “short circuit” to the start of the next page.

Here is an example that may be helpful... My OS is Winders so you may need different INFILE and FILE statement options.

[pre]
filename FT15F001 temp recfm=V;
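data _null_;
   infile FT15F001 length=l eof=eof;   /* l receives the length of each record read */
   array line[10] $80;
   do _n_ = 1 by 1 until(cc eq '1');
      input line[_n_] $varying80. l @;   /* read the whole line and hold the record */
      if indexw(line[_n_],'bad') then do;
         /* the page contains the word "bad": skip ahead to the next page */
         do until(cc eq '1');
            input / cc $varying1. l @;
         end;
         input @1 @@;
         delete;
      end;
      input / cc $varying1. l @1 @;   /* read the next record's carriage-control character */
   end;
eof:
   file print noprint;
   do _n_ = 1 to dim(line);
      l = length(line[_n_]);
      put line[_n_] $varying80. l;
   end;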
parmcards;
1This is page 1
 This is a bad page
 more stuff
0more stuff
 more stuff
 more stuff
 end of page
1This is page 2
0This is a good page badly
-more stuff
 more stuff
 more stuff
 end of page
1This is page 3
 more stuff
 This is a bad page
 end of page
1This is page 4
0more stuff
 This is a Good page
0more stuff
 more stuff
 end of page
;;;;
run;
[/pre]
HoustonGSC
Calcite | Level 5
Sorry for the late response, just got back from spring break with the kiddos.

The lrecl=161 and recfm=fba and it is a ps file.

The first 72 characters of the first few records look like this :

1 $$DJDE JDE=STD15,SIDE=(NUFRONT,NOFFSET),END;
JNM: GSGB020 P R I V A T E L A B E L
PGM: GSGB020 BOOK A G R O U P B U Y O R D E R
0


CA
ITEM PACK SIZE DESCRIPTION O.I. REBATE CO

0 0001 PRIVATE LABEL MANDARIN ORANGES


The variable string I am looking for is the BOOK A string.

I have worked with arrays in perl and VB script but not in SAS so forgive me if I am not reading your logic correctly.

It looks like you are reading the first 10 variables into an array (I'm guessing that a space is the delimiter) and then searching each array item and looking for the target string.
If you find the string then you keep the line but you delete the line if target string not found.

Is this correct? Or are you reading the first 10 lines into an array?
data_null__
Jade | Level 19
Based on the new info you provided I have changed the program to read each line of each page into an array LINE, and if "BOOK A" is found then the page is kept. This is a bit simpler than my other program.

Yes, each line of each page is read into the LINE array. There is only one page at a time in the array, and then if it is a good page it is printed. If you know exactly where the KEEP info is you can short circuit the reading similar to my original program.

The elements of the LINE array are variables. The value of each variable is the entire contents of a line. I don't know if I answered all your questions.
    [pre]
data _null_;
   infile FT15F001 length=l eof=eof;
   array line[20] $161; *array dim should equal report page size;
   keeperFlag = 0;
   do _n_ = 1 by 1 until(cc eq '1');
      input @1 line[_n_] $char161. @@;   /* store the whole record in the array */
      putlog _infile_;
      if not keeperFlag then if index(line[_n_],'BOOK A') then keeperFlag = 1;
      input / @1 cc $1. @@;   /* read the next record's carriage-control character */
   end;
eof:
   file print noprint ls=161;
   if keeperFlag then do _n_ = 1 to dim(line);
      put line[_n_] $char161.;
   end;
parmcards4;
1$$DJDE JDE=STD15,SIDE=(NUFRONT,NOFFSET),END;
 JNM: GSGB020 P R I V A T E L A B E L
 PGM: GSGB020 BOOK A G R O U P B U Y O R D E R
0


 CA
 ITEM PACK SIZE DESCRIPTION O.I. REBATE CO

0 0001 PRIVATE LABEL MANDARIN ORANGES
1$$DJDE JDE=STD15,SIDE=(NUFRONT,NOFFSET),END;
 JNM: GSGB020 P R I V A T E L A B E L
 PGM: GSGB020 BOOK B G R O U P B U Y O R D E R
0


 CA
 ITEM PACK SIZE DESCRIPTION O.I. REBATE CO

0 0002 PRIVATE LABEL MANDARIN ORANGES
1$$DJDE JDE=STD15,SIDE=(NUFRONT,NOFFSET),END;
 JNM: GSGB020 P R I V A T E L A B E L
 PGM: GSGB020 BOOK A G R O U P B U Y O R D E R
0


 CA
 ITEM PACK SIZE DESCRIPTION O.I. REBATE CO

0 0003 PRIVATE LABEL MANDARIN ORANGES
1$$DJDE JDE=STD15,SIDE=(NUFRONT,NOFFSET),END;
 JNM: GSGB020 P R I V A T E L A B E L
 PGM: GSGB020 BOOK C G R O U P B U Y O R D E R
0

 CA
 ITEM PACK SIZE DESCRIPTION O.I. REBATE CO

0 0004 PRIVATE LABEL MANDARIN ORANGES
;;;;
run;
    [/pre]
HoustonGSC
Calcite | Level 5
Thanks for your feedback, it has been extremely helpful!

It's taken me a little while to look at the code and review the usage notes for arrays and macros but I think I finally understand most of what you and Cynthia were getting at.

I have included the code below and it is almost working as desired.

The problem I am running into is that the print file has already been created and there is not a fixed number of lines per page. The number varies according to the number of detail lines.

If I dimension the array to the maximum number of lines per page, then any page with fewer lines than that is padded with blank lines at the end, so the output is not a true copy of the original file.

I tried specifying an array LINE{*} but this gives me the error:
The array LINE has been defined with zero elements

Is it possible to track the number of lines per page (CC=1) and then pass this to the array definition or something like that?

//STEP17 EXEC SASPROC
//SASIN DD DISP=SHR,DSN=PL.DISK.GSGB020B.GRPBUY.BOOKS(0)
//BKS2PRT DD *
BOOKA
BOOKB
BOOKC
BOOKZ
//SASLIST DD SYSOUT=A
//SYSIN DD *
 OPTIONS MPRINT MLOGIC SYMBOLGEN;
 /**********************************************************/
 /* STORE BOOKS TO PRINT AS MACRO VARIABLES */
 /**********************************************************/
 DATA _NULL_;
   INFILE BKS2PRT LENGTH=CARDIN END=EOF;
   DO _N_ = 1 BY 1 UNTIL(EOF);
     INPUT @1 ABKS2PRT $VARYING80. CARDIN ;
     CALL SYMPUT(CATS(ABKS2PRT), CATS(ABKS2PRT));
   END;
 RUN;
 /******************************************************/
 /* PRINT ONLY THE BOOKS SPECIFIED IN THE */
 /* BKS2PRT FILE ABOVE. */
 /******************************************************/
 DATA _NULL_ ;
   INFILE SASIN LENGTH=RECLEN EOF=EOF;
   ARRAY LINE{66} $161 ;
   KEEPERFLAG = 0;
   DO _N_ = 1 BY 1 UNTIL(CC EQ '1');
     INPUT @1 LINE{_N_} $CHAR161. @@;
     IF NOT KEEPERFLAG THEN
     DO;
       PGMID=INDEX(LINE{_N_},'PGM: GSGB020 ');
       IF PGMID GT 0 THEN
       DO;
         BOOKID = SUBSTR(LINE{_N_}, PGMID+23, 1);
         BOOKS2PRT = '&BOOK'||BOOKID;
         CURBOOK = 'BOOK'||BOOKID;
         IF CURBOOK = RESOLVE(BOOKS2PRT) THEN KEEPERFLAG= 1;
       END;
     END;
     INPUT / @1 CC $1. @@;
   END;
 EOF:
   FILE PRINT NOPRINT LS=161;
   IF KEEPERFLAG THEN DO _N_ = 1 TO DIM(LINE);
     PUT LINE{_N_} $CHAR161.;
   END;
 RUN;
data_null__
Jade | Level 19
_N_ is being used to count lines as each page is read.

For a KEEPER, this value should be 1 larger than the number of lines read, because the first line of the "next" page has also been read. The exception is a page that ends at end of file, in which case you don't want to subtract one.

Add the END= option to the INFILE statement, e.g.

[pre] infile FT15F001 length=l eof=eof end=eof;[/pre]

Also, with FIXED length records the INFILE option LENGTH is not necessary; I forgot to remove it when I changed the example to fixed records.

Then change this line

[pre]
IF KEEPERFLAG THEN DO _N_ = 1 TO DIM(LINE);
[/pre]
to
[pre]
IF KEEPERFLAG THEN DO _N_ = 1 TO _n_ - (not eof);
[/pre]

That should produce the desired result.
HoustonGSC
Calcite | Level 5
Thanks, that just about did it.

Everything looked good except that the last line of the page was not printing.

Then I remembered that I had read something about a difference between the DO WHILE and DO UNTIL loops: the WHILE test is evaluated at the top of the loop and the UNTIL test is evaluated at the bottom of the loop.

So I changed:
DO _N_ = 1 BY 1 UNTIL(CC EQ '1'); to DO _N_ = 1 BY 1 WHILE(CC NE '1');
and now it is working perfectly.

I really appreciate your assistance with this, I wouldn't have had a clue w/o your help.

Thanks again.
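
The DO UNTIL versus DO WHILE difference mentioned above can be checked in isolation. This is a minimal sketch with made-up variable names (I, J, FLAG), not the program from the thread. It shows that when an UNTIL condition ends an iterative DO loop the index is left at its current value, while a WHILE condition is tested only after the index has been incremented. That would explain why `_n_ - (not eof)` came up one short with UNTIL and matched the line count with WHILE.

```sas
data _null_;
  /* UNTIL is tested at the bottom, before the index is incremented */
  do i = 1 by 1 until (flag);
    flag = (i = 3);
  end;
  putlog 'UNTIL: ' i=;     /* i=3 */

  /* WHILE is tested at the top, after the index is incremented */
  flag = 0;
  do j = 1 by 1 while (not flag);
    flag = (j = 3);
  end;
  putlog 'WHILE: ' j=;     /* j=4 */
run;
```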
