02-26-2016 09:52 AM
Per my attachment I am trying to extract data from this complex report format. The data I need is contained in rows as this examples:
179948 049 346 3564199769 1 3 29.00 873 9998054826 ABCX 02-24-16
179949 049 346 3565535359 1 3 75.00 214 9998094081 ABCX 02-24-16
179950 049 346 3566939725 1 3 100.00 154 9998094761 ABCX 02-24-16
179951 049 346 3567925194 1 3 10.00 212 9998093720 ABCX 02-24-16
I also need to associate the "batch label" to each record when it this label changes. For example, "BATCH ABC", "BATCH CCD" should be associated with the rows following the label.
How do I read each unique batch label and then the rows without including any other extraneous data?
As I am new to SAS this is all I have done:
DATA TEMP; INFILE IN1; INPUT @2 SEQNUM $6. @10 TRANCODE $3. @17 CLIENTNO $3. @25 LOANNO $10. @37 CARDCODE $1. @41 HITYPE $1. @47 PRNCADV $10. @61 CHECKNO $4. @68 DESC $25. @94 PAYEE $14. @111 OVERRIDE $1. @126 EFFDATE $8.; IF TRANCODE = '049';
Thanks for any help to understanding this.
02-26-2016 10:42 AM
Looks like a mess, can you not get access to the original data? If you have to go this route, then maybe a two step approach (and note I can't provide code as I am not typing that whole image file in to have some test data to play with):
Step 1, read in the whole text file as eaach row being a long string, say input row=$2000. Then you will have a dataset with a the whole file. You can then start to process each row. First drop rows with "xxxxxx" in first place. The next record, you can pull out the batch number and put it into another variable with retain to hold it over lines. Then do a count, if you hit fourth iteration of "----" then you know data has started, so start outputting those rows. Now you have a dataset with the data part in one long charater strring, and a variable for batch number. Next, for the long string start splitting it out into data elements, you can do this via substr, perl etc.
02-26-2016 10:46 AM
It can be done but as @RW9 mentioned you will have to post sample data.
You'll also have to check all your unique cases, so I see two different layouts in your sample, but assuming there's likely more.
One option is to read all in as text using the _infile_ automatic variable, and then parse it out.
I'm not sure what your code below was supposed to do, but its not going to read that file.
03-16-2016 03:15 PM
Based on initial comments and additional research I had figured this out. Below is the high level approach I used:
1. Used data step to input each line as $133. string
2. Selected lines to process by using index function - i.e, index(string, "abc") or index(string, "def") or index(string, "ghi")
3. Saved data to flat file
4. Processed flat file via code below to select the record data
TYPE $1. @;
IF TYPE = 'P' THEN INPUT
@11 BATCH $9.;
IF TYPE IN('1','2','3','4','5','6','7','8','9','0');
@1 SEQNUM $6.
@9 TRANCODE $3.
@16 CLIENTNO $3.