Hello everyone,
In my experience building SAS processes, I spend a lot of time tracking down the issues or conditions that drop or include records when the process involves multiple steps. Any ideas or best practices would be appreciated.
Regards,
Dario
It would be easier to help if you provided some example code, or at least skeleton code showing the sequence of steps, because it is not quite clear what the problem is. But you might have fallen into the bad habit of reusing your input dataset as output.
In the following example it is impossible to see what happens where:
data want; set have;
if <drop condition> then delete;
run;
data want; set want;
if <another drop condition> then delete;
run;
data want; set want;
if <keep condition> then output;
run;
Use different output data sets and keep the dropped records for control purposes. Then it is easy to track down what happens where:
data want1 drop1; set have;
if not <drop condition> then output want1;
else output drop1;
run;
data want2 drop2; set want1;
if not <another drop condition> then output want2;
else output drop2;
run;
data want3 drop3; set want2;
if <keep condition> then output want3;
else output drop3;
run;
When everything works you can change your code to avoid creation of drop-datasets.
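As a concrete sketch of the pattern above: the data set `have`, the variables `account_id`, `balance`, and `status`, and both drop rules are made up for illustration, so substitute your own. Stacking the drop data sets with the `INDSNAME=` option of the SET statement records which step rejected each record, so a question about one account becomes a single lookup.

```sas
/* Hypothetical input data: all names and values are assumptions */
data have;
  input account_id balance status $;
  datalines;
101 500 A
102 -20 A
103 300 C
104 0 A
;
run;

/* Step 1: drop negative balances, keeping the rejects in drop1 */
data want1 drop1;
  set have;
  if not (balance < 0) then output want1;
  else output drop1;
run;

/* Step 2: drop closed accounts, keeping the rejects in drop2 */
data want2 drop2;
  set want1;
  if not (status = 'C') then output want2;
  else output drop2;
run;

/* Stack the reject data sets; INDSNAME= stores the name of the */
/* data set each row was read from in the temporary variable src */
data all_drops;
  length dropped_in $41;
  set drop1 drop2 indsname=src;
  dropped_in = src;  /* copy to a permanent variable */
run;

/* Look up where one specific account was dropped */
proc print data=all_drops;
  where account_id = 103;
run;
```

The value of `dropped_in` names the drop data set, and therefore the step, that removed the account.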
@dario_medina wrote:
Thank you Erik. More than specific code, this is about general good practice. This is the problem: I build large pieces of code that pull data from multiple sources and then apply business rules in two or more data/proc steps.
During testing, or after the process is in production, I get questions along these lines:
- Why doesn't my specific account show on the reports? This type of question usually leads me to dig into the code to see at what point the record(s) were dropped. I do this by running the code and then exploring each step to see which business rule made the record drop, which is time-consuming. Hence my question; I guess I'm not the first one asking.
Thank you for replying,
Darío
In addition to @smijoss1's suggestion: for records removed in a data step, instead of a simple delete you might consider writing the records to a "removed" data set. A very broad outline of this approach:
data continue (drop=<list of the removal flag variables>) removed;
set have;
<set removal reason flags>
if <any of the removal flags are set> then output removed;
else output continue;
run;
If records are removed in other procedures, then a description of how that is done may be needed for more targeted responses.
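A minimal sketch of that outline, assuming a hypothetical data set `have` with variables `account_id`, `balance`, and `status`. The flag names `rm_negative` and `rm_closed` and the two rules behind them are invented for illustration. Because every rule is evaluated for every record, the `removed` data set shows all the reasons a record failed, not just the first one.

```sas
/* Hypothetical input data: all names and values are assumptions */
data have;
  input account_id balance status $;
  datalines;
101 500 A
102 -20 A
103 300 C
104 -5 C
;
run;

data continue (drop=rm_negative rm_closed) removed;
  set have;
  /* One 0/1 flag per business rule; a comparison returns 1 when true */
  rm_negative = (balance < 0);
  rm_closed   = (status = 'C');
  /* Any flag set: keep the record, with its flags, for review */
  if rm_negative or rm_closed then output removed;
  else output continue;
run;

/* Summarize how many records each combination of rules removed */
proc freq data=removed;
  tables rm_negative * rm_closed / list missing;
run;

/* Show why one specific account was removed */
proc print data=removed;
  where account_id = 104;
run;
```

Keeping `removed` as a permanent data set in production would let you answer "why is my account missing?" without rerunning the process, at the cost of some extra storage.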