Creating a file of bypassed records from reading an input file - Page 2

Tom · Posted 06-16-2024 09:59 PM

You are still missing the semicolon on the FILE statement (and your indentation is still very confused).

Assuming you fixed the FILE statement, that code would NOT write to both of those text files. At least not the little snippet. But since you did not DELETE the observation that you wrote to the BYPASS file, that observation would make it into the DATASET that the step is creating.

Why are you writing to a file you are pointing to with a FILEREF (or DDNAME) of DATAIN? Normally that would be the file you want to READ from. Is there some later process that will read from that file? Is that part of this SAS code? Or some other JCL step in your JOB?

BeerSultan · Posted 06-16-2024 11:30 PM

Long-story short...

I'm modifying an existing report that takes hundreds of thousands of insurance policies and calculates financial reserves. A change to a very small subset of policies now has some benefits becoming inactive, so we want to exclude them. However, this indicator data field is only available up to this current point in the job I'm modifying, as its output file does not contain this status code. The next job step is the summary part, where it would be easy to count up these bypassed-due-to-inactive-status policies. Logically the simplest solution would be to expand the output to include this field so we can just count in the next step, but it is a widely used data structure, and so would involve making changes to a couple dozen other jobs if expanded. We don't have the bandwidth to tackle that kind of work at the moment, but we want to get these policies off the books as we're setting aside not-insignificant amounts of money that could instead be reinvested.

The goal of my process is to find the policies that don't need to have their reserves calculated, pull them out of the mainline processing (the incoming read file) and ship it to a second file. This second file will be sent to the next job step and I'll just count the rows and slap it into the summation section for # of policies skipped due to inactive benefit component. Fulfilling the requirement of the business now, versus in months, and not significantly impacting anything outside of this specific need.

The code I provided has the record on both files coming out of this job.

Kurt_Bremser · Posted 06-17-2024 02:58 AM

While writing to multiple files in one DATA step is possible, because of the need for physical filenames it can be tricky on z/OS.

Instead I would write the data to two WORK datasets (the OUTPUT statement can be given a destination name to write to, which is not possible with PUT), and then run a DATA step for each to create the files.
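A minimal sketch of that two-dataset approach, assuming filerefs (DDNAMEs) named LTAINPUT, LTAOUT and LTABYPSS and showing only two fields of the record layout. The dataset names ACTIVE and BYPASS and the fileref LTAOUT are placeholders for illustration, not names taken from the actual job.

```sas
/* Sketch only: LTAINPUT, LTAOUT and LTABYPSS are assumed filerefs,  */
/* and the two fields stand in for the full record layout.           */
data work.active work.bypass;
  infile ltainput;
  input @1    record_number    $char8.
        @1662 component_status $char1.;
  if component_status = 'I' then output work.bypass; /* inactive: bypass */
  else output work.active;                           /* everything else  */
run;

/* One DATA _NULL_ step per text file */
data _null_;
  set work.active;
  file ltaout;
  put @1 record_number $char8.;
run;

data _null_;
  set work.bypass;
  file ltabypss;
  put @1 record_number $char8.;
run;
```

Because each writing step has a single FILE statement, the record counts in the log map one-to-one to the output files.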


Tom · Posted 06-17-2024 09:24 AM

I still don't see why this is such a problem. But you still have not explained the process with enough detail for us to help you.

Are you reading in TEXT files in this step that do have the information you need? (I know that old-time IBM mainframes like to call plain old files "data sets", but they have nothing to do with an actual SAS dataset. They are just files.)

What is the current OUTPUT of that step? Is it another TEXT file? Or is it a SAS dataset?

If it is a TEXT file then just skip the code that writes the observation to the text file (however many lines it writes) and they will "disappear" from the file consumed by the downstream systems.

If it is a SAS dataset then just skip writing to that dataset. Does the current step have any OUTPUT statements? If so then skip them for those cases. If not then you need to either add an OUTPUT statement for the observations you want to make it to the next step or add a DELETE statement for the ones you don't want to make it to the next step.
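The two options for the dataset case can be sketched as follows. The dataset names HAVE and WANT and the variable STATUS are hypothetical, chosen only for illustration.

```sas
/* Sketch: HAVE, WANT and STATUS are hypothetical names. */

/* Option 1: the step has no OUTPUT statement, so drop rows with DELETE */
data want;
  set have;
  if status = 'I' then delete;   /* row never reaches the implicit OUTPUT */
run;

/* Option 2: an explicit OUTPUT turns off the implicit one, */
/* so only rows that reach the statement are written        */
data want;
  set have;
  if status ne 'I' then output;
run;
```

Both steps produce the same WANT dataset; which one fits depends on whether the existing step already relies on explicit OUTPUT statements.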

BeerSultan · Posted 06-17-2024 10:19 AM

I also don't know why it's such a problem, which is why I'm asking for help.

I feel I explained enough that it should be evident what the underlying ask is - like you said this shouldn't be a problem. But, here's a recap: 2 (text) files - the second has no bearing on the ultimate process - come in from the mainframe, the step changes a few fields based on criteria, and it outputs a single merged (text) file. This merged output file does not contain the field necessary to count/summarize in the next step. I cannot expand the output file to accommodate it at this time because that has extensive impacts across dozens of other jobs which would need to be changed and tested and validated.

I want to use the status code on the input file to write a bypass record to the new bypass (text) file when the status code is inactive. I want the record to not be on the existing output file because the values in that file are used to calculate reserve cash, and an inactive policy does not need a reserve because the benefit is gone. I'll count the records in the bypass file in the next step and add it to the summary page. That's it.
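Counting the bypass records in the next step could look roughly like this. The fileref BYPASS and the macro variable BYPASS_COUNT are assumed names for illustration; the real job would use its own DDNAME.

```sas
/* Sketch: BYPASS is an assumed fileref for the bypass text file, */
/* BYPASS_COUNT is a hypothetical macro variable name.            */
%let bypass_count = 0;   /* default in case the file is empty */

data _null_;
  infile bypass end=eof;
  input;                 /* read each record without parsing any fields */
  n + 1;                 /* sum statement: running record count         */
  if eof then do;
    putlog 'NOTE: bypassed policies = ' n;
    call symputx('bypass_count', n);   /* make the count available later */
  end;
run;
```

The `%let` before the step matters because a DATA step reading an empty file stops at the INPUT statement and never reaches the `if eof` block.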

Tom · Posted 06-17-2024 12:23 PM

Is this being done in a single data step?

If so then make a simplified version so you are clear how to read from TWO input files and write to TWO output files in a single step.

So you need FOUR files. You can define four DDNAMES in your JCL (or have four FILENAME statements in your SAS program). Two are DISP=OLD or DISP=SHR since they already exist. Two are DISP=(NEW,CATLG) since you are making them.
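If the files are defined in the SAS program rather than in JCL, the four FILENAME statements might look something like the sketch below. The filerefs and the data set names are placeholders, and allocation attributes such as SPACE=, RECFM= and LRECL= are left out; they would have to come from the real job.

```sas
/* Sketch for SAS on z/OS: all data set names here are hypothetical. */
filename in1  'HLQ.POLICY.INPUT1' disp=shr;          /* existing input  */
filename in2  'HLQ.POLICY.INPUT2' disp=shr;          /* existing input  */
filename out1 'HLQ.POLICY.MERGED' disp=(new,catlg);  /* new main output */
filename out2 'HLQ.POLICY.BYPASS' disp=(new,catlg);  /* new bypass file */
```

When the DD statements already exist in the JCL, the DDNAMEs can be used as filerefs directly and no FILENAME statements are needed.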

Personally if I was reading from two input files the code would look something like this:

* Get data from the first file;
data one;
  infile in1;
  input ....;
run;

* Get data from the second file;
data two;
  infile in2;
  input ....;
run;

* Some step(s) that combined ONE and TWO into  THREE ;

* Write data from the resulting dataset back out;
data _null_;
   set three;
   if condition='normal' then do;
     file out1;
     put .... ;
   end;
   else do;
     file out2;
     put .... CONDITION ;
   end;
run;

If that is not how your current SAS code works then explain your data flow in more detail. (Or perhaps just change this step to work that way for simplicity.)

Let's make a trivial example. So your current process is writing a TEXT file that has variables NAME, SEX, AGE, HEIGHT and WEIGHT (so it matches SASHELP.CLASS). Your process has changed and you now have data that includes a new variable named FLAG. You want to continue to write the same 5 variables to the same output file. But you want to skip the observations where FLAG=0. And instead you want to write those observations to some other new data file with a new structure.

Let's call the original target file OUT and the new file (for the skipped observations) NEW. So your original program might look like this. You make some dataset that you want to write. Then you write it.

1    * Original code;
2    data analysis;
3      set sashelp.class;
4    run;

NOTE: The data set WORK.ANALYSIS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


5    data _null_;
6      set analysis;
7      file out;
8      put name $8 sex $1 age 3. Height 7.2 weight 7.2 ;
9    run;

NOTE: The file OUT is:
      (system-specific pathname),
      (system-specific file attributes)

NOTE: 19 records were written to the file (system-specific pathname).
      The minimum record length was 18.
      The maximum record length was 18.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

The new process will look like this. The step that makes the analysis dataset has changed. It now adds a new variable. Let's call it FLAG. And the step that writes the file out now needs to write two files based on the value of FLAG.

11   * New code;
12   data analysis;
13     set sashelp.class;
14     flag = (sex='M');
15   run;

NOTE: The data set WORK.ANALYSIS has 19 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


16   data _null_;
17     set analysis;
18     if flag=1 then do;
19       file out;
20       put name $8 sex $1 age 3. Height 7.2 weight 7.2 ;
21     end;
22     else do;
23       file new;
24       put name $8 sex $1 age 3. Height 7.2 weight 7.2 flag 1. ;
25     end;
26   run;

NOTE: The file OUT is:
      (system-specific pathname),
      (system-specific file attributes)

NOTE: The file NEW is:
      (system-specific pathname),
      (system-specific file attributes)

NOTE: 10 records were written to the file (system-specific pathname).
      The minimum record length was 18.
      The maximum record length was 18.
NOTE: 9 records were written to the file (system-specific pathname).
      The minimum record length was 19.
      The maximum record length was 19.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

So you can check the NOTES that SAS generates to make sure that the right number of lines have been written to each file.
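One detail worth checking in the PUT statements above: `$8` and `$1` have no trailing period, so SAS reads them as column positions rather than as the formats `$8.` and `$1.`. That is consistent with the 18-byte record length in the log, since five formatted fields of widths 8, 1, 3, 7 and 7 would add up to 26 bytes. A sketch with explicit formats, assuming the intent was a fixed-width layout (the temporary fileref is only for illustration):

```sas
/* Sketch: OUT points at a scratch file here, purely for illustration. */
filename out temp;

data _null_;
  set sashelp.class;
  file out;
  /* the trailing periods make these formats, not column positions */
  put name $8. sex $1. age 3. height 7.2 weight 7.2;
run;
```

With the formats applied, each record should come out at 8 + 1 + 3 + 7 + 7 = 26 bytes, and the same check of the log NOTES applies.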

BeerSultan · Posted 06-20-2024 12:52 PM

Coming back full-circle, I was able to get it to work the way I needed.

DATA LTAINPUT;
  INFILE LTAINPUT;
  INPUT
    @ 00001  RECORD_NUMBER                         $CHAR8.
...(many fields)
    @ 01662  COMPONENT_STATUS                      $CHAR1.
...(many additional fields)
  ;
 
  /*******************************************************************/
  /* INACTIVE BENEFIT COMPONENT RECORDS SPLIT TO BYPASS FILE         */
  /*******************************************************************/
  IF COMPONENT_STATUS = 'I' THEN DO;
    FILE LTABYPSS;
    PUT
      @ 00001 POLICY_NUMBER       $CHAR8.;
      DELETE;
    END;
    ELSE DO;
    FILE LTAINPUT
    PUT
        @ 00001  RECORD_NUMBER                         $CHAR8.
... (many fields)
        ;
  END;

Tom · Posted 06-20-2024 12:55 PM

That is what we said all along.

Note your posted code is still missing the semicolon on the second FILE statement. And now it is missing an END statement.

BeerSultan · Posted 06-20-2024 05:38 PM

Previous examples given were not solutioned this way and for the most part directed to entirely separated data steps, so "all along" is a bit inaccurate, but I appreciate the help. Just trying to learn on-the-fly.

Pasted code is executing perfectly to my knowledge - if there's a missing semi-colon it appears immaterial to the execution, and I can't explain that considering I've known SAS for all of an entire week. Similarly, there's no more ENDs in my code - so I can't comment on it being missing or not, I don't know where you would like an additional END placed though.

I've pushed several million records through this over the last 2 days without a problem, and have values confirmation/validation from the financial specialist who owns the output report from this processing. If there's something syntactically wrong I would expect some kind of errors or unpredictable execution, both of which I'm not seeing. My ears/eyes are open for any insight though.

Tom · Posted 06-20-2024 06:29 PM

The proof is in the pudding.

If the SAS log for that data step is showing that the right number of lines were read from the input file and written to the two output files then the code is working.
