Id just like some advice and pointers on ways you would take on the following task:
I use SAS EG and have access to a SAS view on our warehouse. The table contains a record for every day that an account is open, and contains various information for that account (Date, Account_ID, Outstanding_Balance, Arrears_Amount etc etc). The original table that this comes from has new records added daily, so it is sorted in date order. We have rows going back to 2012.
I need to create monthly generation files that contain information for an account where something specific has changed from the day before. For example, if on 21st February 2017 an Account_ID has Arrears_Amount of 0 and then on the 22nd February that Account_ID has arrears amount of 50, I would want to create a variable called Arrears_Changed and set that to "Y" and then push that row of data into the generational dataset with all other changes that occurred in February 2017.
My challenges are that the dataset is large and not sorted by account_id and that it's a view, so efficiency is really the key here.
Any suggestions greatly appreciated.
If your not changing the original source then:
- Extract the required information into a dataset local to you (i.e. small part, only what you need)
- Sort and process data
- Output to file
If you need to change the original data, then add that in as part of the capture process.
You saying that the original table is sorted by date (but not necessarily by account within date), is that correct? Are you also saying that your view is not also sorted by date? (And I take it that this is a proc sql view, not a data step view).
If the view is sorted by date, than a data step reading that view and using a hash object can easily track the latest_balance by account for each incoming row/observation - i.e. one hash "dataitem" per account. One can also do this with an unsorted view as well, but the hash object will take a lot more memory, depending on the number of records for each account.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.