Solved: Help with the data with header record and detail record.

saslovethemost · Posted 05-19-2021 07:18 PM

I have the following two data steps. I want to get the data with the header record with type 'H' (first char in edidata) for each isaln.lnno(second data step) that is found in edidata(first data step). Type(first char) is 'H' for Header record, blank ' ' for detail record. In the below example, first record is Header(H) record, second record is the detail record. Thank you in advance.

Data edidata;
input type $1.;
datalines;
H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                                                        
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+                            
 582150361000010391590421050100010760979+00000048200+00000027670+        00000000000+  057500+00000079366+            21051500000000000+00000000000+00000000000+00000000000+                            
H210515514101          BURFDG    MPLLRX300 21051523190000  114211350           *                                                                                                                        
 771935218000040157720321050100000445426+00000002598+00000012921+        00000000000+  072500+00000015690+            21051500000000000+00000000000+00000000000+00000000000+                            
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                                                        
 65498128007-25008799  21050100003931943+00000013926+00000000000+        00000000000+0 045000+00000025334+            21042600000000000+00000000000+00000000000+00000000000+                            
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+                            
 73096930407-25008963  21050100012029749+00000033834+00000000000+        00000000000+0 036250+00000067496+            21042200000000000+00000000000+00000000000+00000000000+                            
;
run;

data isaln;
input lnno $9.;
datalines;
360846572
615594980
;
run;

Data want is like below:

H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                                                        
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+                            
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                                                        
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+

Kurt_Bremser · Posted 05-22-2021 01:36 PM

Anyway, this works by using the header as a group key:

data have;
input line $char200.;
datalines;
H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                            
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+
 582150361000010391590421050100010760979+00000048200+00000027670+        00000000000+  057500+00000079366+            21051500000000000+00000000000+00000000000+00000000000+
H210515514101          BURFDG    MPLLRX300 21051523190000  114211350           *                                                                                            
 771935218000040157720321050100000445426+00000002598+00000012921+        00000000000+  072500+00000015690+            21051500000000000+00000000000+00000000000+00000000000+
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                            
 65498128007-25008799  21050100003931943+00000013926+00000000000+        00000000000+0 045000+00000025334+            21042600000000000+00000000000+00000000000+00000000000+
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+
 73096930407-25008963  21050100012029749+00000033834+00000000000+        00000000000+0 036250+00000067496+            21042200000000000+00000000000+00000000000+00000000000+
;
 
data isafile;
input lnno $9.;
datalines;
360846572
615594980
;

data comb;
set have;
retain header;
if substr(line,1,1) = "H"
then header = line;
else output;
run;

data want;
if _n_ = 1
then do;
  length lnno $9;
  declare hash l (dataset:"isafile");
  l.definekey("lnno");
  l.definedone();
  lnno = "";
end;
do until (last.header);
  set comb;
  by header notsorted;
  if l.check(key:substr(line,2,9)) = 0 then flag = 1;
  put flag=;
end;
do until (last.header);
  set comb(rename=(line=_line));
  by header notsorted;
  if flag
  then do;
    if first.header
    then do;
      line = header;
      output;
    end;
    if l.check(key:substr(_line,2,9)) = 0
    then do;
      line = _line;
      output;
    end;
  end;
end;
keep line;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

saslovethemost · Posted 05-19-2021 09:27 PM

data isafile;
input @001 lnno $9.;
datalines;
360846572
615594980
;
run;
proc sort data=isafile out=isafmt(keep=lnno);by lnno;run;
data isaloan(keep=lnno);
   set isafmt;
   fmtname='ISAMATCH';
   type='C';
   start=lnno;
   label='YES';
run;
proc format cntlin=isaloan;run;
data flatfile;
 input @0001 type $1.;
 datalines;
H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                            
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+
 582150361000010391590421050100010760979+00000048200+00000027670+        00000000000+  057500+00000079366+            21051500000000000+00000000000+00000000000+00000000000+
H210515514101          BURFDG    MPLLRX300 21051523190000  114211350           *                                                                                            
 771935218000040157720321050100000445426+00000002598+00000012921+        00000000000+  072500+00000015690+            21051500000000000+00000000000+00000000000+00000000000+
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                            
 65498128007-25008799  21050100003931943+00000013926+00000000000+        00000000000+0 045000+00000025334+            21042600000000000+00000000000+00000000000+00000000000+
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+
 73096930407-25008963  21050100012029749+00000033834+00000000000+        00000000000+0 036250+00000067496+            21042200000000000+00000000000+00000000000+00000000000+
;
 if type ='H' then
   do;
      input @002 header_rec $198.;
      output flatfile;
   end;
 else if type = ' ' then
   do;
  	  input @0002 dloan         $09.
  	   @;
       if put(dloan,$ISAMATCH.)='YES' then
         do;
	       @0011 rest_rec  $188.;
            output flatfile;
         end;
   end;
run;
proc print data = flatfile;
run;

sorry this code is not working, lot of errors I see, I copied this from zOS mainframe, just to give you an idea what I was trying. Apologies for my errors because of my cobol background.

Astounding · Posted 05-20-2021 11:32 AM

Here are a few issues to address. Once these fixes are in place, you can see what errors remain.

The first one may not be a problem at all, but here it is. In your program, the name of the format is $ISAMATCH (not ISAMATCH). So when you set up FMTNAME, its value should also be "$ISAMATCH" (not ISAMATCH). SAS may figure this out because the TYPE of the format is C (character). But safer to change the FMTNAME value.

Second, the order of your statements in the DATA step is incorrect. When you use DATALINES, it must be the last part of the DATA step. So all the IF THEN statements are not part of the DATA step, and will generate tons of errors. Move the DATALINES section to the end of the DATA step.

Finally, when inputting the header indicator, you need to hold the input line in case more data needs to be read. So this statement is incorrect:

input @0001 type $1.;

To hold the line so you can read DLOAN from the same line, switch to:

input @0001 type $1. @;

Without seeing the log, it's difficult to know whether this fixes everything. But you will soon find out.

Kurt_Bremser · Posted 05-20-2021 01:18 PM

Your "want" is not a dataset, but it looks like another, filtered, flat file.

I would do the following:

Read the original data in a data step, creating two datasets. One dataset contains the header, the other the detail data. RETAIN the "key" variable and keep it in the header and the detail table.

You would get two datasets looking like that:

data header;
input key $ name $;
datalines;
A XXX
B YYY
C ZZZ
;

data detail;
input key $ lnno :$9.;
datalines;
A 360846572
A 582150361
B 771935218
C 654981280
C 615594980
C 730969304
;

First, filter the detail dataset:

data detail_filtered;
set detail;
if _n_ = 1
then do;
  declare hash isaln(dataset:"isaln");
  isaln.definekey("lnno");
  isaln.definedone();
end;
if isaln.check() = 0;
run;

Then, merge the header and the filtered details, keep only matches, and write the file:

data _null_;
file '/folders/myfolders/result.txt';
merge
  header (in=h)
  detail_filtered (in=d)
;
by key;
if h and d;
if first.key then put key $3. name $3.;
put " " lnno;
run;

The resulting file looks like this:

A  XXX
 360846572
C  ZZZ
 615594980

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

saslovethemost · Posted 05-21-2021 04:16 PM

Thank you for your inputs. Sorry I think I did not properly explain the issue I am having.

First dataset/file is like below:

H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                            
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+
 582150361000010391590421050100010760979+00000048200+00000027670+        00000000000+  057500+00000079366+            21051500000000000+00000000000+00000000000+00000000000+
H210515514101          BURFDG    MPLLRX300 21051523190000  114211350           *                                                                                            
 771935218000040157720321050100000445426+00000002598+00000012921+        00000000000+  072500+00000015690+            21051500000000000+00000000000+00000000000+00000000000+
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                            
 65498128007-25008799  21050100003931943+00000013926+00000000000+        00000000000+0 045000+00000025334+            21042600000000000+00000000000+00000000000+00000000000+
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+
 73096930407-25008963  21050100012029749+00000033834+00000000000+        00000000000+0 036250+00000067496+            21042200000000000+00000000000+00000000000+00000000000+

In the above example, the first record starting with 'H' is the header record which has date and time stamp next to it and then some servicer bureau info and the program name(pllrx300). The second row which is the detail record start with a blank and then loan number(first 9 chars) and some information related to the loan. For one header record, there can be single/multiple detail records with loan numbers.

The second dataset has just the loan numbers(9 chars), the second dataset will definitely have some matching loan numbers in the detail record(not with header, remember header record will not have loan number info), For these matching loans in the details records, I want to create a file with the header record with the detail record for the loan numbers matched.

Second dataset/file is like below:

360846572
615594980;

The output I would like to see is like below: Header record and detail record for matching loans in second dataset/file.

H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                            
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                            
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+

Thank you for your help.

Neal.

Kurt_Bremser · Posted 05-22-2021 03:47 AM

Is it your sole task to filter the text file, or do you intend to use the data for any analysis in SAS?

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Kurt_Bremser · Posted 05-22-2021 01:36 PM

Anyway, this works by using the header as a group key:

data have;
input line $char200.;
datalines;
H210515405510          BURFDG    MPLLRX300 21051523110000  193211350           *                                                                                            
 360846572000010382300421050100000311808+00000001429+00000040292+        00000000000+  058750+00000031916+            21051500000000000+00000000000+00000000000+00000000000+
 582150361000010391590421050100010760979+00000048200+00000027670+        00000000000+  057500+00000079366+            21051500000000000+00000000000+00000000000+00000000000+
H210515514101          BURFDG    MPLLRX300 21051523190000  114211350           *                                                                                            
 771935218000040157720321050100000445426+00000002598+00000012921+        00000000000+  072500+00000015690+            21051500000000000+00000000000+00000000000+00000000000+
H210515045808          BURFS11   MPLLRX300 21051522161700  5202115             *                                                                                            
 65498128007-25008799  21050100003931943+00000013926+00000000000+        00000000000+0 045000+00000025334+            21042600000000000+00000000000+00000000000+00000000000+
 61559498007-25008813  21050100005546513+00000015600+00000000000+        00000000000+0 036250+00000109598+            21042800000000000+00000000000+00000000000+00000000000+
 73096930407-25008963  21050100012029749+00000033834+00000000000+        00000000000+0 036250+00000067496+            21042200000000000+00000000000+00000000000+00000000000+
;
 
data isafile;
input lnno $9.;
datalines;
360846572
615594980
;

data comb;
set have;
retain header;
if substr(line,1,1) = "H"
then header = line;
else output;
run;

data want;
if _n_ = 1
then do;
  length lnno $9;
  declare hash l (dataset:"isafile");
  l.definekey("lnno");
  l.definedone();
  lnno = "";
end;
do until (last.header);
  set comb;
  by header notsorted;
  if l.check(key:substr(line,2,9)) = 0 then flag = 1;
  put flag=;
end;
do until (last.header);
  set comb(rename=(line=_line));
  by header notsorted;
  if flag
  then do;
    if first.header
    then do;
      line = header;
      output;
    end;
    if l.check(key:substr(_line,2,9)) = 0
    then do;
      line = _line;
      output;
    end;
  end;
end;
keep line;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

saslovethemost · Posted 05-22-2021 07:46 PM

Thank you very much for your help!, you have awesome coding techniques. Really appreciate your time.

Kurt_Bremser · Posted 05-23-2021 02:37 AM

If you care to share the record description for both the header and detail records, I can show you how to read the file into usable SAS datasets.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Re: Help with the data with header record and detail record.

Registration is open

SAS Training: Just a Click Away