BookmarkSubscribeRSS Feed
shellp55
Quartz | Level 8

Hello

I have patient data sets with two file formats:  original file and correction files.  I've written a program where I import the original file, import the correction file(s) and using "update" the correction data overwrites the original.  Everything is working well on this.  However there is a third file format which is a deletion whereby instead of overwriting the original, I need to delete it based on this deletion file.

How would I go about doing that?

Thanks very much.

5 REPLIES 5
art297
Opal | Level 21

After you do the update, couldn't you merge the resulting file with the deletion file (only keeping id and identifying it as:

in=in_delete), and then including a statement indicating:

if not in_delete

?

shellp55
Quartz | Level 8

Hi Arthur

Thanks for responding.  Sorry I'm not clear on your response because the record will be in the updated file because it was in the original submission.

Or are you saying that after the combined corrections file updates the original, then use that file to say if not in delete then new data set?

Thanks.

art297
Opal | Level 21

Shellp55: The latter .. after completing the update, doing a merge in a separate datastep, and only keeping those records that aren't in the delete file.  If that doesn't make sense, then I probably don't understand what you have and/or are trying to accomplish.

Patrick
Opal | Level 21

Is this a once off taks or are you implementing some regular ETL process?

If it's ETL: That would be a common scenario. You get a file for an initial full load and then you get incremental file(s) for insert/update and delete. If it's regular ETL then please provide some more information (what you get and what you need) and also what you've done already. This would allow us to give you some more guidance.

sachin01663
Obsidian | Level 7

art297 is right.

Once you have updated your original file with the correct data, you can merge with the Deletion file and keep only those records which are in the Original file with (In 😃 statement.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1427 views
  • 0 likes
  • 4 in conversation