With the following code (not the exact code, but the structure is the same) I can see duplicate records in the .txt file that I'm generating. Is there a way to remove the duplicate records in the same data step? Since the program uses LINK statements, I'm not sure how to remove the duplicates. I would appreciate it if you could let me know whether there is a way to split the following program into 2-3 data steps without affecting the logic. Once the program is split, I believe it will be easy to remove the duplicates.
TBH, you would be better off totally refactoring that code. Most of it is unnecessary. One prime example is the write-out statements (put) at the end, which are being "linked" in for no apparent reason and all do exactly the same thing. Much the same goes for the first 3 if blocks, each of which runs exactly the same code. It's a lot of code to do something which is very simple. I would split it into three steps:
1) Process data into the format you want
2) Sort this data to drop duplicates
3) Export data to file
Also, you are not writing a CSV file, so do not call it one. The put statements show that the data is being written out in a fixed-width file format. If this were sent to me I would reject it as non-conformant. The file extension should reflect the file type.
I was just informed that I need to create a .txt file. I have modified the OP now.
My question is: will it be OK if I remove all the LINK statements and write the put statement right after the file statement? I have never used the LINK statement before, and I'm not sure whether I can remove it here. I'm just looking for a structure to rework the code, which was the main intention of this post as well.
If you need to create a text file, then the file extension should be .txt, and you can refer to it as a fixed-width text file. This makes it clear to everyone what you are working with.
LINK is a bit like copying the code from the labeled block up to the point where the link happens, so you can copy and paste that code yourself into each place where it is linked. The reason you will not have seen LINK before is that (well, at least in my experience) there is never a need for it. Here the link is trying to reduce the amount of repeated code in the step; however, with other statements like:
knkbty = '6';
appearing three times, it's a bit silly.
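To illustrate the copy-and-paste point, here is a minimal sketch of one step written with LINK and the same step with the linked code moved inline. Only knkbty comes from the original code. The dataset have, the variables id and type, the label writeout and the column positions are made up for illustration.

```sas
/* Hypothetical input; only knkbty is taken from the thread. */
data have;
  input id $ type $;
  datalines;
A X
B Y
;
run;

/* With LINK: execution jumps to the label, and the RETURN
   inside the labeled block comes back to the statement
   after the LINK. */
data _null_;
  set have;
  file print;
  if type = 'X' then link writeout;
  else if type = 'Y' then link writeout;
  return;            /* end this iteration before the labeled block */
writeout:
  knkbty = '6';
  put @1 id $8. @10 knkbty $1.;
  return;
run;

/* Without LINK: the labeled code is copied to where it was linked,
   and the identical branches collapse into one. */
data _null_;
  set have;
  file print;
  if type in ('X', 'Y') then do;
    knkbty = '6';
    put @1 id $8. @10 knkbty $1.;
  end;
run;
```

Both steps should write the same lines. The second form is usually easier to follow because there is no jump to trace.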
So in terms of:
"My question is, Will it be OK if I remove all the link statements and write the put statement right after the file statement?" - not quite, you need to copy the link code to each place where it is linked. This is why I suggest a refactor.
1) Process data into the format you want - this is the merge, and the if statements.
2) Sort this data to drop duplicates - this is the new part of the process.
3) Export data to file - this is one simple data _null_; file...; put...; run;
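The three steps above could be sketched roughly as follows. This is not the original program. The datasets in_a and in_b, the variables id, amount and region, the fileref outtxt, the file name output.txt and the column positions are all assumptions for illustration. Only knkbty appears in the original code.

```sas
/* Hypothetical inputs, already sorted by id. in_a has a repeated row. */
data in_a;
  input id $ amount;
  datalines;
A 10
A 10
B 20
;
run;

data in_b;
  input id $ region $;
  datalines;
A EAST
B WEST
;
run;

/* 1) Process data into the format you want: the merge and the if logic. */
data staged;
  merge in_a (in=a) in_b (in=b);
  by id;
  if a and b;
  knkbty = '6';
run;

/* 2) Sort to drop duplicates: BY _ALL_ with NODUPKEY keeps one copy
      of each fully identical record. */
proc sort data=staged out=deduped nodupkey;
  by _all_;
run;

/* 3) Export to a fixed-width text file (replace the path as needed). */
filename outtxt "output.txt";

data _null_;
  set deduped;
  file outtxt;
  put @1 id $8. @10 amount 8. @19 region $8. @28 knkbty $1.;
run;
```

If only some variables define a duplicate, list those variables on the BY statement instead of _ALL_. Note that the sort also changes the record order, so if the original order matters, add a sequence variable in step 1 and sort on it again before the export.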