09-26-2012 06:15 PM
I have patient data sets with two file formats: original file and correction files. I've written a program where I import the original file, import the correction file(s) and using "update" the correction data overwrites the original. Everything is working well on this. However there is a third file format which is a deletion whereby instead of overwriting the original, I need to delete it based on this deletion file.
How would I go about doing that?
Thanks very much.
09-26-2012 08:03 PM
After you do the update, couldn't you merge the resulting file with the deletion file (only keeping id and identifying it as:
in=in_delete), and then including a statement indicating:
if not in_delete
09-27-2012 03:17 PM
Thanks for responding. Sorry I'm not clear on your response because the record will be in the updated file because it was in the original submission.
Or are you saying that after the combined corrections file updates the original, then use that file to say if not in delete then new data set?
09-28-2012 08:35 PM
Shellp55: The latter .. after completing the update, doing a merge in a separate datastep, and only keeping those records that aren't in the delete file. If that doesn't make sense, then I probably don't understand what you have and/or are trying to accomplish.
09-28-2012 11:12 PM
Is this a once off taks or are you implementing some regular ETL process?
If it's ETL: That would be a common scenario. You get a file for an initial full load and then you get incremental file(s) for insert/update and delete. If it's regular ETL then please provide some more information (what you get and what you need) and also what you've done already. This would allow us to give you some more guidance.
09-29-2012 08:12 AM
art297 is right.
Once you have updated your original file with the correct data, you can merge with the Deletion file and keep only those records which are in the Original file with (In =) statement.