zfusfeld
Fluorite | Level 6

Hello all,

I am trying to work out a somewhat complex conditional delete based on repeated observations. I have a dataset of people who were each interviewed at two time points, and there are three types of repeaters that need to be pruned.

I want to delete observations for all of these repeaters so that each has only one visit in the dataset, and I would like to do this conditionally on two variables: ACECAT (ACEs category) and INT_D (interview date). Each of the three types of repeaters needs one visit removed. Repeaters can be identified by another variable, site_ID (though I have already restricted this dataset so that only repeaters are present).

The first type of repeater is missing ACECAT at BOTH visits. For these, it doesn't really matter which observation gets deleted, but for the sake of consistency I would like to delete the first observation and keep the last. INT_D is displayed in DDMMMYYYY format.

The second type of repeater has answered ACECAT at one visit and is missing ACECAT at the other. For these repeaters, I want to drop the observation where ACECAT is missing.

The third type of repeater has answered ACECAT at both visits. For these types of repeaters, I want to keep only the most recent observation and drop the last observation.

I've been trying to think of a data step (or multiple data steps if needed) that can solve this problem, but am drawing a blank. Can anyone help?


2 REPLIES
maguiremq
SAS Super FREQ

Can you provide us with some example data so that we can test out code? Thanks.

mkeintz
PROC Star

@zfusfeld wrote:


The third type of repeater has answered ACECAT at both visits. For these types of repeaters, I want to keep only the most recent observation and drop the last observation.

If your data are sorted by INT_D within each id/group, then isn't the most recent observation (to be kept) the same as the last observation (to be dropped)?
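
If the third rule is meant to keep the most recent visit and drop the earlier one, then all three cases seem to reduce to a single preference order within each site_ID: a visit with ACECAT answered beats one with ACECAT missing, and between two otherwise equal visits the later INT_D wins. Below is a minimal sketch of that idea, not tested against your data. The sample rows, the dataset names HAVE, FLAGGED, and WANT, and the helper variable has_ace are all made up for illustration. I am also assuming that INT_D is a numeric SAS date displayed with a DATE9. format and that ACECAT is numeric (MISSING() works for character values too).

```sas
/* Hypothetical sample data: all values are invented for illustration. */
data have;
    input site_ID $ INT_D :date9. ACECAT;
    format INT_D date9.;
    datalines;
A01 05JAN2020 .
A01 12MAR2021 .
B02 10FEB2020 2
B02 20APR2021 .
C03 15JAN2020 1
C03 30JUN2021 3
;
run;

/* has_ace = 1 when ACECAT was answered, 0 when it is missing. */
data flagged;
    set have;
    has_ace = not missing(ACECAT);
run;

/* Within each site_ID, missing-ACECAT visits sort first, answered visits
   sort last, and ties on has_ace are ordered by interview date. */
proc sort data=flagged;
    by site_ID has_ace INT_D;
run;

/* Keep only the last row per site_ID: an answered visit if one exists,
   otherwise the latest visit. */
data want;
    set flagged;
    by site_ID;
    if last.site_ID;
    drop has_ace;
run;
```

With the sample rows, WANT should contain the 12MAR2021 visit for A01 (both missing, keep the last), the 10FEB2020 visit for B02 (keep the answered one), and the 30JUN2021 visit for C03 (both answered, keep the most recent). If INT_D is actually stored as a character string, it would need converting first, for example with input(INT_D, date9.), because character dates in DDMMMYYYY form do not sort chronologically.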

