BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas1011
Calcite | Level 5

Hello,

 

Can anyone provide a code that can help me remove multiple occurences from a certain field, if a condition is met.

 

 

To be more clear, here is an exampl of my data set: 

 

Member_ID             Effective_date                  End _date                              Reason

 

1234                         01JUL2019                    31AUG2019                        Backed out

1234                         01SEP2019                   31OCT2019                        Re-enrolled

1234                         01SEP2019                   31OCT2019                        Continuous enrollment

2345                         01JUL2019                    31JUL2019                         Cancelled membership

3456                         01SEP2019                   30SEP2019                        Cancelled membership

4567                         01JUL2019                    31JU2019                           Backed out

 

In the above sample data,

 

I want to count number of occurences for member_ID and if there occurs more than 1 occurence of member_ID, I want to compare the effective_dates for that ID. 

 

Condition: If the effective_date is repeated for each member_ID in the occurences, then remove the duplicates. 

 

Final output that I expect:

 

Member_ID             Effective_date                  End _date                              Reason

 

1234                         01JUL2019                    31AUG2019                        Backed out

1234                         01SEP2019                   31OCT2019                        Continuous enrollment

2345                         01JUL2019                    31JUL2019                         Cancelled membership

3456                         01SEP2019                   30SEP2019                        Cancelled membership

4567                         01JUL2019                    31JU2019                           Backed out

 

 

Thank you

 

 

Update: Include additional data variables for clarity. 

I want to be able to choose which occurence to keep if there are multiple occurences. In the above example, I chose to keep the record that has the reason as "Continuous Enrollment".

 

Hope this helps in helping me

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Proc SQL perhaps?

 

proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;

View solution in original post

12 REPLIES 12
novinosrin
Tourmaline | Level 20

HI @sas1011   Can you please post your sample as a table format as one instead of one column followed by another? Thank you!

sas1011
Calcite | Level 5

Sorry. I edited it now

maguiremq
SAS Super FREQ
data have;
input Member_ID Effective_date :date9. End_Date :date9.;
format Effective_date date9. End_Date date9.;
datalines;
1234 01JUL2019 31AUG2019
1234 01SEP2019 31OCT2019
1234 01SEP2019 31OCT2019
2345 01JUL2019 31JUL2019
3456 01SEP2019 30SEP2019
4567 01JUL2019 31JUL2019
;

proc sort data=have out=want nodupkey dupout = want_dup;
by Member_ID Effective_date;
run;

 

sas1011
Calcite | Level 5

Thank you for the response - Please see comment by @mkeintz : My situation is more realted towards the 2nd statement.  

mkeintz
PROC Star

What if:

  1. Two records have matching MEMBER_ID and EFFECTIVE_DATE, but differing END_DATEs?
  2. Two records have matching MEMBER_ID, EFFECTIVE_DAE and END_DATE, but other variables that vary between records?

 

The "proc sort nodupkey" strategy works only if neither of the 2 conditions above occur.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
sas1011
Calcite | Level 5

Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record. 

 

Thank you for pointing that out.

novinosrin
Tourmaline | Level 20

 

In that case, I think @maguiremq  's solution should do. Am i right?

 

 


@sas1011 wrote:

Yes - i want to compare only the member_id and the effective_date. There are other variables that vary in my dataset but if the member_id and the dates are repeated, then I want only one occurence of that record. 

 

Thank you for pointing that out.


 

maguiremq
SAS Super FREQ

We're going to need more data, it seems.

sas1011
Calcite | Level 5

I edited my question to include additional info.

 

Pardon me for not being clear - I've not posted much in this community and am still learning.


Thank you

novinosrin
Tourmaline | Level 20

Proc SQL perhaps?

 

proc sql;
create table want as
select *
from have
group by Member_ID,Effective_date
having count(*)=1 or count(*)>1 and Reason='Continuous enrollment';
quit;
sas1011
Calcite | Level 5

This did it. Thank you 😄

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1146 views
  • 0 likes
  • 5 in conversation