BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas1011
Calcite | Level 5

Hello,

 

Can anyone provide a code that can help me remove multiple occurences from a certain field, if a condition is met.

 

 

To be more clear, here is an exampl of my data set: 

 

Member_ID             Effective_date                  End _date                              Reason

 

1234                         01JUL2019                    31AUG2019                        Backed out

1234                         01SEP2019                   31OCT2019                        Re-enrolled

1234                         01SEP2019                   31OCT2019                        Continuous enrollment

2345                         01JUL2019                    31JUL2019                         Cancelled membership

3456                         01SEP2019                   30SEP2019                        Cancelled membership

4567                         01JUL2019                    31JU2019                           Backed out

 

In the above sample data,

 

I want to count number of occurences for member_ID and if there occurs more than 1 occurence of member_ID, I want to compare the effective_dates for that ID. 

 

Condition: If the effective_date is repeated for each member_ID in the occurences, then remove the duplicates. 

 

Final output that I expect:

 

Member_ID             Effective_date                  End _date                              Reason

 

1234                         01JUL2019                    31AUG2019                        Backed out

1234                         01SEP2019                   31OCT2019                        Continuous enrollment

2345                         01JUL2019                    31JUL2019                         Cancelled membership

3456                         01SEP2019                   30SEP2019                        Cancelled membership

4567                         01JUL2019                    31JU2019                           Backed out

 

 

Thank you

 

 

Update: Include additional data variables for clarity. 

I want to be able to choose which occurence to keep if there are multiple occurences. In the above example, I chose to keep the record that has the reason as "Continuous Enrollment".

 

Hope this helps in helping me

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Proc SQL perhaps?

 

proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;

View solution in original post

12 REPLIES 12
novinosrin
Tourmaline | Level 20

HI @sas1011   Can you please post your sample as a table format as one instead of one column followed by another? Thank you!

sas1011
Calcite | Level 5

Sorry. I edited it now

maguiremq
SAS Super FREQ
data have;
input Member_ID Effective_date :date9. End_Date :date9.;
format Effective_date date9. End_Date date9.;
datalines;
1234 01JUL2019 31AUG2019
1234 01SEP2019 31OCT2019
1234 01SEP2019 31OCT2019
2345 01JUL2019 31JUL2019
3456 01SEP2019 30SEP2019
4567 01JUL2019 31JUL2019
;

proc sort data=have out=want nodupkey dupout = want_dup;
by Member_ID Effective_date;
run;

 

sas1011
Calcite | Level 5

Thank you for the response - Please see comment by @mkeintz : My situation is more realted towards the 2nd statement.  

mkeintz
PROC Star

What if:

  1. Two records have matching MEMBER_ID and EFFECTIVE_DATE, but differing END_DATEs?
  2. Two records have matching MEMBER_ID, EFFECTIVE_DAE and END_DATE, but other variables that vary between records?

 

The "proc sort nodupkey" strategy works only if neither of the 2 conditions above occur.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
sas1011
Calcite | Level 5

Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record. 

 

Thank you for pointing that out.

novinosrin
Tourmaline | Level 20

 

In that case, I think @maguiremq  's solution should do. Am i right?

 

 


@sas1011 wrote:

Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record. 

 

Thank you for pointing that out.


 

maguiremq
SAS Super FREQ

We're going to need more data, it seems.

sas1011
Calcite | Level 5

I edited my question to include additional info.

 

Pardon me for not being clear - I've not posted much in this community and am still learning.


Thank you

novinosrin
Tourmaline | Level 20

Proc SQL perhaps?

 

proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;
sas1011
Calcite | Level 5

This did it. Thank you 😄

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 1742 views
  • 0 likes
  • 5 in conversation