Solved: Re: Removing multiple occurences based on conditions

sas1011 · Posted 11-08-2019 02:29 PM

Hello,

Can anyone provide a code that can help me remove multiple occurences from a certain field, if a condition is met.

To be more clear, here is an exampl of my data set:

Member_ID Effective_date End _date Reason

1234 01JUL2019 31AUG2019 Backed out

1234 01SEP2019 31OCT2019 Re-enrolled

1234 01SEP2019 31OCT2019 Continuous enrollment

2345 01JUL2019 31JUL2019 Cancelled membership

3456 01SEP2019 30SEP2019 Cancelled membership

4567 01JUL2019 31JU2019 Backed out

In the above sample data,

I want to count number of occurences for member_ID and if there occurs more than 1 occurence of member_ID, I want to compare the effective_dates for that ID.

Condition: If the effective_date is repeated for each member_ID in the occurences, then remove the duplicates.

Final output that I expect:

Member_ID Effective_date End _date Reason

1234 01JUL2019 31AUG2019 Backed out

1234 01SEP2019 31OCT2019 Continuous enrollment

2345 01JUL2019 31JUL2019 Cancelled membership

3456 01SEP2019 30SEP2019 Cancelled membership

4567 01JUL2019 31JU2019 Backed out

Thank you

Update: Include additional data variables for clarity.

I want to be able to choose which occurence to keep if there are multiple occurences. In the above example, I chose to keep the record that has the reason as "Continuous Enrollment".

Hope this helps in helping me

novinosrin · Posted 11-08-2019 03:09 PM

Proc SQL perhaps?

proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;

View solution in original post

novinosrin · Posted 11-08-2019 02:32 PM

HI @sas1011 Can you please post your sample as a table format as one instead of one column followed by another? Thank you!

sas1011 · Posted 11-08-2019 02:40 PM

Sorry. I edited it now

Reeza · Posted 11-08-2019 02:34 PM

Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

maguiremq · Posted 11-08-2019 02:38 PM

data have;
input Member_ID Effective_date :date9. End_Date :date9.;
format Effective_date date9. End_Date date9.;
datalines;
1234 01JUL2019 31AUG2019
1234 01SEP2019 31OCT2019
1234 01SEP2019 31OCT2019
2345 01JUL2019 31JUL2019
3456 01SEP2019 30SEP2019
4567 01JUL2019 31JUL2019
;

proc sort data=have out=want nodupkey dupout = want_dup;
by Member_ID Effective_date;
run;

sas1011 · Posted 11-08-2019 02:48 PM

Thank you for the response - Please see comment by @mkeintz : My situation is more realted towards the 2nd statement.

mkeintz · Posted 11-08-2019 02:43 PM

What if:

Two records have matching MEMBER_ID and EFFECTIVE_DATE, but differing END_DATEs?
Two records have matching MEMBER_ID, EFFECTIVE_DAE and END_DATE, but other variables that vary between records?

The "proc sort nodupkey" strategy works only if neither of the 2 conditions above occur.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas1011 · Posted 11-08-2019 02:46 PM

Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record.

Thank you for pointing that out.

novinosrin · Posted 11-08-2019 02:52 PM

In that case, I think @maguiremq 's solution should do. Am i right?

@sas1011 wrote:

Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record.

Thank you for pointing that out.

maguiremq · Posted 11-08-2019 02:53 PM

We're going to need more data, it seems.

sas1011 · Posted 11-08-2019 03:04 PM

I edited my question to include additional info.

Pardon me for not being clear - I've not posted much in this community and am still learning.

Thank you

novinosrin · Posted 11-08-2019 03:09 PM

Proc SQL perhaps?

proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;

sas1011 · Posted 11-08-2019 03:23 PM

This did it. Thank you 😄

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away