I'm given a horrible task of QC and reading in 48 monthly csv files which looks like below, and they are all in one csv sheet I also attached a copy of the , I don't know how to read this in as SAS data set and I really need your help.
What I am after is 2 SAS data set, one for each teams belongs to segment ABC i.e. TEAM_123_SEGMENT_ABC.SAS7BDAT & TEAM_125SEGMENT_ABC.SAS7BDAT, and I also need to check if the 4 totals in the NO OF POL FOR TEAM xxx trailer adds up for each COL1, COL2, COL3 & AMT OUTSTANDING as well as checking the trailer for NO OF POL FOR SEGMENT xxx add up for team 123 & team 124 that belongs to Segment ABC.
The data is quite messy please contact me so I can send the .csv file.
After tackling this problem for the last 4 hours I'm getting close to what I want which is 2 SAS data sets one for Team 123 of Segment ABC and one SAS data set for team 125 of Segment ABC by making the first field to be character but it needs to be numeric.
But I also want to check the sum of COL1, COL2, COL3 & AMT_OUTSTANDING against the trailers if possible in my code below and I'm much grateful for any suggestions to make this thing works.
COL1 COL2 COL3 AMT_OUT
NO OF POL FOR TEAM 124 --------------- --------------- ---------------
8288.22 8288.22 13249.152 4630.2032
To re-create the csv file please copy the text to notepad and rename the extension to .csv to test out the code.
Not sure if I totally got how your data structure looks like.
From what I understand I would propose you create 2 datasets: 1 contains the team totals, one contains the team details.
Key for joining (merging) these 2 datasets is the team#.
Once you have the data organised this way it shouldn't be too hard to create whatever report you need (just use by group processing).
I believe you shouldn't use macro language at all for a problem which can be solved without.
I hope the following code will give you at least some ideas of how you can read such "messy" data:
infile datalines dsd dlm=',' truncover;
if find(_infile_,'NO OF POL FOR TEAM','i')>0 then
input _dummy_:$1. col1 col2 col3 AMT_OUT;
input col1 col2 col3 AMT_OUT;
keep team col1 col2 col3 AMT_OUT;
NO OF POL FOR TEAM 124,6,8,10,11